Introduction

This project was completed as part of a paid internship with the International Water Security Network and was the dissertation project for an MSc in Data Science at the University of the West of England.

The project was written in the R programming language and uses the shiny library (Chang et al., 2022) to build an interactive web application that gives users an accessible interface for executing R code on the application backend.

All work was completed independently over a development life cycle of approximately 14 weeks.

Project Aims

The purpose of the project was to create a tool that could enable non-technical users to independently access and utilise raw data.

End users include professionals from the humanitarian sector currently working in response to the refugee crisis in Bangladesh to ensure affected communities receive adequate access to water, sanitation and hygiene facilities.

The tool I have created provides non-technical users with a platform that enables them to independently extract insight from data through a range of analytical and visualisation tools. Automatic data cleansing and an easy-to-use graphical interface remove the skills barrier that may prevent sector professionals from utilising data to enhance the effectiveness of aid initiatives in the region.

Project Pitch Video

Application Summary

Feature Walkthrough


Live Demo

Click the thumbnail to view a live demo version of the application.

demo app link

Note: A sample dataset is pre-loaded for demonstration purposes. File import and password entry widgets are retained for illustration only.


Code Repository

All code is available via my GitHub repository.
Functioning sample data for application use is located in the /data folder.

git hub link


Data, Cleansing & Processing

Data

Socioeconomic Survey

In January 2022, a consortium of specialist practitioners and researchers from the United Nations High Commissioner for Refugees (UNHCR), Oxfam, Ground Water Relief (GWR), Asian University for Women (AUW), and University of the West of England (UWE) collaborated to undertake a detailed socioeconomic survey of water usage in the Teknaf district of Bangladesh.

The survey gathered data from 462 members of the refugee and host communities.

Survey Questions

The survey questionnaire was composed of the following question categories:

questions <- read.csv("../data/questions_table.csv", header = TRUE)
colnames(questions) <- c("Variable Code", "Question Text")
# general
general <- questions[c(1:9),]
# sociodemographic
sd <- questions[grepl("sd", questions[,1]),]
# work
work <- questions[questions$`Variable Code` %in% c("w1_1","w2","w3","w4","w5","w6","w7","w8"),]
# perceptions
perc <- questions[questions$`Variable Code` %in% c("p1","p2","p3","p4","p5"),]
# water insecurity
hw <- questions[grepl("hw", questions[,1]),]
# water access & utility
wat <- questions[grepl("wat", questions[,1]),]
# livelihood activities
liv <- questions[grepl("liv", questions[,1]),]
# paying for water
pay <- questions[grepl("pay", questions[,1]),]
# wellbeing
wb <- questions[grepl("wb", questions[,1]),]

Sociodemographic

kable(sd, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Code Question Text
sd1 Head Of Hh
sd2 Relationship Status
sd3 Age
sd4 Household: Responsible To Get Water
sd5_ Household: Ensures Sufficient Water
sd6_ Household: Children Under 18
sd7 Household: Adult Members
sd8 Household: Elderly Members
sd9 Left Myanmar
sd11 Financial Assistance
sd11_1 Financial Assistance: Amount
sd11_2 Financial Assistance: Source
sd11_3 Financial Assistance: Usage
sd12 Highest Level Of Education
sd13 House: Type
sd14 House: No. of Rooms
sd15 House: Has a Garden
sd16 House: Ownership
sd17 House: Electricity Supply
sd18 House: Piped Water Supply
sd19 House: Sewerage Connection
sd20 Primary Fuel Source
sd21 Rate Community: Socioeconomic Standing
sd22 Rate Community: Water Situation

Work

kable(work, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%")
Variable Code Question Text
w1_1 Income Source: Employment In Government
w2 Income Source: Employment In Private Sector
w3 Income Source: Casual Labour
w4 Income Source: Own Business
w5 Income Source: Farming
w6 Income Source: From Family Relative
w7 Difficult To Obtain Sufficient Income
w8 Experienced Problems & Solutions

Perceptions

kable(perc, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%")
Variable Code Question Text
p1 Water&Sanitation Needs Being Met
p2 Women & Men: Equal Responsibility for Sanitation
p3 Women & Men: Equal Awareness of Feedback Processes
p4 Women & Men: Feedback Equally Valued
p5 Women & Men: Equal Awareness of Sanitation Rights

Water Insecurity

kable(hw, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Code Question Text
hw Calculated HWISE Score
hw1 Worry About Water Supply
hw2 Supply Interruptions
hw2a Supply Interruptions: Expected or Unexpected
hw3 Unable to do Laundry Due to Water Situation
hw4 Schedule Change Due to Water Situation
hw5 Change What Was Eaten Due to Water Situation
hw6 Unable to Wash Hands After Dirty Activity
hw7 Unable to Wash Body Due to Water Situation
hw8 Not Enough Water to Drink
hw9 Felt Anger About Water Situation
hw10 Gone to Sleep Thirsty
hw11 No Useable or Drinkable Water
hw12 Felt Shame About Water Situation
hw13 Asked to Borrow Water

Access & Utility

kable(wat, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Code Question Text
wat1 Drinking Water: Primary Source
wat2 Drinking Water: Secondary Source
wat3 Non-drinking Water: Primary Source
wat4 Drinking Water: Time to Source (mins)
wat5 Drinking Water: No.of Trips
wat5a Drinking Water: Collection Time Per Week (mins)
wat5b Drinking Water: Collection Time Per Week (grouped)
wat6 Injured While Fetching Water
wat7 Problem With Water Quality
wat8 The Water Quality Problem
wat9 Non-drinking Water: Time to Source (mins)
wat10 Non-drinking Water: No.of Trips
wat11 Treated Water To Make It Safer
wat12_ Drinking Water Treatment
wat12_1 Drinking Water Treatment: Boil
wat12_2 Drinking Water Treatment: Filter
wat12_3 Drinking Water Treatment: Add Chemicals Chlorine Tablets
wat12_4 Drinking Water Treatment: Other Specify
wat12_5 Drinking Water Treatment: Do Nothing
wat12_6_ Drinking Water Treatment: Multiple Methods
wat14 Kolosh
wat14_1 Buckets
wat14_2 Jerry Cans
wat14_3 Other
wat15_ Fewest Water Problems
wat15_1 Fewest Water Problems: January
wat15_2 Fewest Water Problems: February
wat15_3 Fewest Water Problems: March
wat15_4 Fewest Water Problems: April
wat15_5 Fewest Water Problems: May
wat15_6 Fewest Water Problems: June
wat15_7 Fewest Water Problems: July
wat15_8 Fewest Water Problems: August
wat15_9 Fewest Water Problems: September
wat15_10 Fewest Water Problems: October
wat15_11 Fewest Water Problems: November
wat15_12 Fewest Water Problems: December
wat16 Main Cause Of Problems With Water
wat17 What Do You Do When You Dont Have Enough Water

Livelihood

kable(liv, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Code Question Text
liv1_ Income Generation
liv1_1 Income Generation: Crop Production
liv1_2 Income Generation: Betel Nut Leaf Production
liv1_3 Income Generation: Livestock Production
liv1_4 Income Generation: Salt Production
liv1_5 Income Generation: Sea Fishing
liv1_6 Income Generation: Aquaculture Shrimp Farming
liv1_7 Income Generation: Aquaculture Fish Farming
liv1_8 Income Generation: Other
liv1_9 Income Generation: None Of Above
liv1_10_ Income Generation: Multiple Activities
liv2 Do You Own Businesses
liv3 Income Generation: Primary Water Source
liv4_ Income Generation: Problematic Water Quantity or Quality
liv5 Please Describe Problem
liv6 What You Did When You Experienced These Problems

Paying for Water

kable(pay, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%")
Variable Code Question Text
pay1 Important To Pay To Install Your Own Water System
pay2 Important To Pay For Regular Costs
pay3 Fee Structure For Use Of Main System

Wellbeing

kable(wb, row.names = F) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Code Question Text
wb1 How Do You Feel About Your Life
wb2 Change Would Make Biggest Positive Difference
wb3 General Health
wb4 No. Days Poor Physical Health
wb5 No. Days Poor Mental Health
wb6 No. Days Health Prevented Normal Activities
wb7 Feel Unable to Control Important Things
wb8 Feel Confident in Ability to Control Problems
wb9 Feel Things Going Your Way
wb10 Feel Difficulties Could Not Be Overcome

Cleansing & Processing

All data cleansing and processing is performed automatically upon successful import of the survey dataset into the application. No interaction or experience in data processing is required from the end user.

The dataset is provided as a password-protected, single-sheet .xlsx file that is imported using the excel.link library (Demin, 2021).
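
Because all later processing depends on a successful import, the import call can be wrapped so that a failure (for example, a wrong password) is detected before any cleansing runs. The following is a minimal sketch in base R only. The helper `safe_import()` and the stand-in `toy_reader()` are hypothetical names used for illustration and are not part of the application or of excel.link.

```r
# Hypothetical helper: run any reader function and return NULL if it fails
safe_import <- function(reader, ...) {
  result <- try(reader(...), silent = TRUE)
  if (inherits(result, "try-error")) {
    # try() stores the original condition as an attribute of the error object
    message("Import failed: ", conditionMessage(attr(result, "condition")))
    return(NULL)
  }
  result
}

# Stand-in reader (assumption): fails unless the expected password is supplied
toy_reader <- function(path, password) {
  if (password != "password") stop("incorrect password")
  data.frame(id = 1:3)
}

ok  <- safe_import(toy_reader, "demo.xlsx", password = "password")
bad <- safe_import(toy_reader, "demo.xlsx", password = "wrong")

print(is.null(ok))   # FALSE
print(is.null(bad))  # TRUE
```

In a sketch like this, downstream cleansing would only run when the returned value is not `NULL`, which is one way of expressing "upon successful import".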

Data Structures

Four data objects are created in total:

Data Objects

Raw Data

An initial raw_data object is created to store the entire dataset. Only minimal transformations to support indexing are performed:

  • All characters converted to lower case.
  • Column names set as variable codes.
  • Digits appended to ensure column names are unique.
# import data
raw_data <- try(xl.read.file("../data/demo_data.xlsx",
                           header = FALSE,
                           password =  "password"))

# Change all chars to lowercase
raw_data <- data.frame(sapply(raw_data, tolower))

# If row 1 (variable code) is blank, fill it with row 2 (field name)
for (i in seq_len(ncol(raw_data))) {
  if (is.na(raw_data[1, i])) {
    raw_data[1, i] <- raw_data[2, i]
  }
}
    
# Set row 1 as colnames and then remove row 1
colnames(raw_data) <- raw_data[1, ]
raw_data <- raw_data[-1, ]

# Append digits to make colnames unique
colnames(raw_data) <- make.unique(colnames(raw_data), sep = "_")

kable(head(raw_data)) %>%
  kable_styling("striped", "condensed", full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
submissiondate starttime endtime deviceid subscriberid simid devicephonenum username duration caseid informed_consent informed_consent_1 consent consent_for_taking_photos interviewer date country location union ward village camp block gps-latitude gps-longitude gps-altitude gps-accuracy sex unique_id sd1 sd2 sd3 sd4 sd5 sd5_1 sd5_2 sd5_3 sd5_4 sd5_5 sd5_6 sd6 sd6_1 sd7 sd8 sd9 sd11 sd11_1 sd11_2 sd11_3 sd12 sd13 sd13_1 sd14 sd15 sd16 sd16_1 sd17 sd18 sd19 sd20 sd20_1 sd21 sd22 w1 w1_1 w2 w3 w4 w5 w6 w7 w8 p1 p2 p3 p4 p5 hw1 hw2 hw2a hw3 hw4 hw5 hw6 hw7 hw8 hw9 hw10 hw11 hw12 hw13 wat1 wat2 wat3 wat4 wat5 wat6 wat7 wat8 wat9 wat10 wat11 wat12 wat12_1 wat12_2 wat12_3 wat12_4 wat12_5 wat12_6 wat13 wat13_1 wat13_2 wat14 wat14_1 wat14_2 wat14_3 wat14_4 wat15 wat15_1 wat15_2 wat15_3 wat15_4 wat15_5 wat15_6 wat15_7 wat15_8 wat15_9 wat15_10 wat15_11 wat15_12 wat16 wat17 liv1 liv1_1 liv1_2 liv1_3 liv1_4 liv1_5 liv1_6 liv1_7 liv1_8 liv1_9 liv2 liv3 liv4 liv4_1 liv4_2 liv4_3 liv5 liv6 pay1 pay2 pay3 wb1 wb2 wb3 wb4 wb5 wb6 wb7 wb8 wb9 wb10 dq2 dq2_1 dq2_2 dq2_3 dq2_4 dq2_5 dq2_6 dq3 dq4 dq5 instanceid formdef_version key review_quality review_comments review_corrections
2 submissiondate starttime endtime deviceid subscriberid simid devicephonenum username duration caseid informed_consent informed_consent consent consent_for_taking_photos interviewer date country location union ward village camp block gps-latitude gps-longitude gps-altitude gps-accuracy sex unique_id head_of_hh current_status age responsible_to_get_water make_sure_there_is_enough_water_in_the_house make_sure_there_is_enough_water_in_the_house_self make_sure_there_is_enough_water_in_the_house_spouse make_sure_there_is_enough_water_in_the_house_boy_children make_sure_there_is_enough_water_in_the_house_girl_children make_sure_there_is_enough_water_in_the_house_other_family_members make_sure_there_is_enough_water_in_the_house_shared_responsibility under_18_years_male under_18_years_female above_18_years elderly left_mayanmar financial_assistance_remittances receiving_assistance_remittances assistance_from_what_source assistance_usage highest_level_of_education physical_type_of_housing please_specify how_many_rooms does_the_house_have_a_garden ownership_of_the_house specify- have_electricity_supply piped_water_supply connection_to_sewerage primary_fuel_source_for_heating-cooking describe place_yourself_on_this_ladder place_yourself_on_this_ladder_2 income_source_past_12_months income_source_past_12_months_employment_in_government income_source_past_12_months_employment_in_private_sector_ngo income_source_past_12_months_casual_labour_food_for_work income_source_past_12_months_own_business income_source_past_12_months_farming_agriculture_and_animal income_source_past_12_months_remittances_from_family_relative difficult_to_obtain_sufficient_income what_problems_and_what_you_did water_and_sanitation_needs_are_being_met women_men_share_equal_responsibility women_men_are_equally_aware_of_how_to_provide_feedback feedback_from_women_men_about_local_water_and_sanitation_management_is_equally_valued. 
women_and_men_are_equally_aware_of_their_water_sanitation_rights last_four_weeks_not_having_enough_water last_four_weeks_main_water_source_been_interrupted supply_was_interrupted_were_these_expected_or_unexpected how_frequently_has_there_not_been_enough_water_to_wash_clothes change_schedules_or_plans_due_to_problems_with_your_water_situation had_to_change_what_was_being_eaten_because had_to_go_without_washing_hands_after_dirty_activities had_to_go_without_washing_their_body has_there_not_been_as_much_water_to_drink you_or_anyone_in_household_feel_angry_about_your_water_situation have_you_or_anyone_in_your_household_gone_to_sleep_thirsty has_there_been_no_usable_or_drinkable_water problems_with_water_caused_you_or_household_member_to_feel_ashamed have_you_or_household_member_asked_to_borrow_water primary_preferred_source_of_drinking_water secondary_second_choice_source_of_drinking_water primary_source_of_water_for_other_purposes_non_drinking_water how_long_does_it_take_to_go_to_the_water_source how_many_trips_to_collect_drinking_water injured_while_fetching_water you_had_problem_with_water_quality the_water_quality_problem how_long_does_it_to_go_to_the_water_source how_many_trips_to_collect_water_for_non_drinking_purposes treated_water_to_make_it_safer primary_way_household_treats_drinking_water primary_way_household_treats_drinking_water_boil primary_way_household_treats_drinking_water_filter primary_way_household_treats_drinking_water_add_chemicals_chlorine_tablets primary_way_household_treats_drinking_water_other_specify primary_way_household_treats_drinking_water_do_nothing specifications okay_for_us_to_take_photos pictures pictures_copy kolosh buckets jerry_cans other specify your_household_experience_the_fewest_water_problems your_household_experience_the_fewest_water_problems_january your_household_experience_the_fewest_water_problems_february your_household_experience_the_fewest_water_problems_march your_household_experience_the_fewest_water_problems_april 
your_household_experience_the_fewest_water_problems_may your_household_experience_the_fewest_water_problems_june your_household_experience_the_fewest_water_problems_july your_household_experience_the_fewest_water_problems__ your_household_experience_the_fewest_water_problems_september your_household_experience_the_fewest_water_problems_october your_household_experience_the_fewest_water_problems_november your_household_experience_the_fewest_water_problems_december main_cause_of_problems_with_water what_do_you_do_when_you_dont_have_enough_water activities_are_you_engaged_in_for_household_food_production_or_income_generation activities_are_you_engaged_in_for_household_food_production_or_income_generation_crop_production activities_are_you_engaged_in_for_household_food_production_or_income_generation_betel_nut_leaf_production activities_are_you_engaged_in_for_household_food_production_or_income_generation_livestock_production activities_are_you_engaged_in_for_household_food_production_or_income_generation_salt_production activities_are_you_engaged_in_for_household_food_production_or_income_generation_sea_fishing activities_are_you_engaged_in_for_household_food_production_or_income_generation_aquaculture_shrimp_farming activities_are_you_engaged_in_for_household_food_production_or_income_generation_aquaculture_fish_farming activities_are_you_engaged_in_for_household_food_production_or_income_generation_other activities_are_you_engaged_in_for_household_food_production_or_income_generation_none_of_the_above do_you_own_the_businesses primary_source_of_water_for_these_activities experienced_problems_with_the_quantity_and_or_quality experienced_problems_with_the_quantity_and_or_quality_yes_quantity experienced_problems_with_the_quantity_and_or_quality_yes_quality experienced_problems_with_the_quantity_and_or_quality_no please_describe_the_problem what_you_did_when_you_experienced_these_problems important_to_pay_to_install_your_own_water_system 
important_do_you_think_it_is_for_you_to_pay_for_regular_costs fee_structure_for_use_of_the_main_system how_do_you_feel_about_your_life change_would_make_the_biggest_positive_difference in_general_that_your_health_is how_many_days_during_past_30_days_would_you_say_that_your_physical_health_was_poor how_many_days_during_the_past_30_days_was_your_mental_health_poor how_many_days_during_the_past_30_days_did_poor_physical_or_mental_health_keep_you_from_doing_your_usual_activities how_often_have_you_felt_that_you_were_unable_to_control_the_important_things how_often_have_you_felt_confident_about_your_ability_to_handle how_often_have_you_felt_that_things_were_going_your_way how_often_have_you_felt_difficulties_were_piling_up_so_high_that_you_could_not_overcome did_the_respondent_show_any_of_the_following did_the_respondent_show_any_of_the_following_mistrust_of_you_or_the_study did_the_respondent_show_any_of_the_following_dishonesty_lying did_the_respondent_show_any_of_the_following_fear_of_you_or_the_study did_the_respondent_show_any_of_the_following_hostility_anger_resentment did_the_respondent_show_any_of_the_following_evasion_avoid_answering did_the_respondent_show_any_of_the_following_none_of_the_above any_interruptions_or_distractions overall_assessment_of_the_quality_of_the_data interviewers_comments instanceid formdef_version key review_quality review_comments review_corrections
3 4/17/2022 15:41 4/17/2022 12:20 4/17/2022 13:10 8.64e+14 4.7e+14 8.99e+19 NA mealenumerator 2604 NA 1 1 1 1 faroque 2022-04-17 bangladesh host whykong 6 jimonkhali NA NA 21.0593875 92.2269164 -39.2 5 male 149 myself married 40 my_spouse self spouse boy_children 1 1 1 0 0 0 1 1 2 2 NA 0 NA NA NA primary_school processed_wood NA 3 0 owned NA 1 0 0 wood NA 5 4 own_business 0 0 0 1 0 0 1 business true true true true true 1-2_times 1-2_times unexpected 0_times 0_times 0_times 0_times 0_times 1-2_times 0_times 0_times 0_times 0_times 0_times borehole/tubewell borehole/tubewell borehole/tubewell 2 3 0 1 salty water, iron 2 3 1 filter 0 1 0 0 0 NA 1 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650178248261.jpg?uuid=uuid%3ab93e131b-e030-4f15-b706-876d53e06d7f https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650178278137.jpg?uuid=uuid%3ab93e131b-e030-4f15-b706-876d53e06d7f 3 4 1 1 drum NA september october 0 0 0 0 0 0 0 0 1 1 0 0 the water is salty water needs to be collected from remote area none_of_the_above 0 0 0 0 0 0 0 0 1 yes_i_own_the_businesses borehole/tubewell yes_quality 0 1 0 because the water is salty, the shop has to collect water from far away 1 / pure water is not easy to drink because water is salty. 2 / there is a problem with cooking for water 3 / water is not easily available for business. somewhat_important not_important_at_all it_is_free somewhat_dissatisfied safe_water_services very_good 3 2 2 fairly_often fairly_often fairly_often fairly_often none_of_the_above 0 0 0 0 0 1 no excellent thank you for taking our information the people of the area will be much happier if they can be provided pure water. uuid:b93e131b-e030-4f15-b706-876d53e06d7f 2204171148 uuid:b93e131b-e030-4f15-b706-876d53e06d7f good

(may 4, 2022 4:06:57 pm): [ submission un-approved. ]

(may 8, 2022 11:23:41 am): [ submission approved. classified as unknown.]
(may 8, 2022 11:23:01 am): assistance_from_what_source, the_water_quality_problem, main_cause_of_problems_with_water, what_do_you_do_when_you_dont_have_enough_water, please_describe_the_problem, what_you_did_when_you_experienced_these_problems, interviewers_comments
4 4/17/2022 15:42 4/17/2022 12:32 4/17/2022 13:13 714132bb35b619f1 NA NA NA mealenumerator 2457 NA 1 1 1 1 ashif 2022-04-17 bangladesh host whykong 8 kharangkhali NA NA 21.04305 92.2335683 -51.8 4.2 female 300 my_spouse married 30 my_spouse self spouse other_family_members 1 1 0 0 1 0 1 2 2 0 NA 0 NA NA NA secondary_school wood/canvas/plastic NA 2 0 owned NA 1 0 0 gas_bottles NA 2 3 own_business 0 0 0 1 0 0 1 covid-19 false true true true true 1-2_times 11-20_times announced/scheduled 11-20_times 11-20_times 3-10_times 11-20_times 11-20_times 3-10_times 11-20_times 3-10_times 3-10_times 11-20_times 11-20_times stand_pipe stand_pipe surface_water_pond_river_lake 120 2 0 1 the taste becomes salty, the color becomes red. 90 3 0 other_specify 0 0 0 1 0 no treatment is required. 1 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650178612580.jpg?uuid=uuid%3ac76ee428-6b85-4a1c-ae36-da570a7c8ac6 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650178650383.jpg?uuid=uuid%3ac76ee428-6b85-4a1c-ae36-da570a7c8ac6 4 4 6 drum-01 NA march april 0 0 1 1 0 0 0 0 0 0 0 0 the water level goes down. i will make arrangements to collect water. none_of_the_above 0 0 0 0 0 0 0 0 1 no_i_work_for_somebody_else other yes_quantity yes_quality 1 1 0 in the dry season the water decreases and the water becomes salty and red. need to collect water from others villages very_important very_important fixed_cost_per_month somewhat_satisfied safe_water_services fair 4 5 5 fairly_often fairly_often fairly_often fairly_often none_of_the_above 0 0 0 0 0 1 no excellent no uuid:c76ee428-6b85-4a1c-ae36-da570a7c8ac6 2204171148 uuid:c76ee428-6b85-4a1c-ae36-da570a7c8ac6 good

(may 4, 2022 4:14:25 pm): [ submission un-approved. ]

(may 8, 2022 11:27:31 am): [ submission approved. classified as unknown.]
(may 4, 2022 4:27:15 pm): what_you_did_when_you_experienced_these_problems, please_describe_the_problem, what_do_you_do_when_you_dont_have_enough_water, main_cause_of_problems_with_water, specifications, the_water_quality_problem
5 4/17/2022 15:42 4/17/2022 13:10 4/17/2022 13:48 8.64e+14 4.7e+14 8.99e+19 NA mealenumerator 2062 NA 1 1 1 1 faroque 2022-04-17 bangladesh host whykong 6 jimonkhali NA NA 21.0597728 92.2273266 -41.6 3.9 female 278 my_spouse married 50 my_spouse self spouse other_family_members 1 1 0 0 1 0 0 1 3 2 2017 0 NA NA NA primary_school wood/canvas/plastic NA 2 0 allocated/given_by_authorities NA 0 0 0 gas_bottles NA 4 8 own_business farming_agriculture_and_animal 0 0 0 1 1 0 1 i have to borrow true true true true true 1-2_times 1-2_times announced/scheduled 0_times 1-2_times 3-10_times 0_times 0_times 1-2_times 1-2_times 0_times 0_times 0_times 1-2_times borehole/tubewell borehole/tubewell borehole/tubewell 3 5 0 1 water is a little bit solty 3 7 1 filter 0 1 0 0 0 NA 1 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650180645879.jpg?uuid=uuid%3abce33d6e-c3b3-4745-9853-94ff7c287234 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650180661220.jpg?uuid=uuid%3abce33d6e-c3b3-4745-9853-94ff7c287234 2 5 2 1 drum NA july august 0 0 0 0 0 0 1 1 0 0 0 0 water dries up due to overcrowding and overuse. it is difficult to collect water from a distance. none_of_the_above 0 0 0 0 0 0 0 0 1 yes_i_own_the_businesses borehole/tubewell yes_quality 0 1 0 it is very difficult to drink pure water because the water is salty. water has to be collected and brought from far away. not_important_at_all not_important_at_all it_is_free somewhat_satisfied better_health_services very_good 3 2 2 fairly_often almost_never almost_never fairly_often none_of_the_above 0 0 0 0 0 1 no excellent thank you so much for collecting our information. we hope you will solve our problem of pure water. uuid:bce33d6e-c3b3-4745-9853-94ff7c287234 2204171148 uuid:bce33d6e-c3b3-4745-9853-94ff7c287234 good

(may 4, 2022 5:04:07 pm): [ submission un-approved. ]

(may 8, 2022 11:29:01 am): [ submission approved. classified as unknown.]
(may 4, 2022 5:12:45 pm): interviewers_comments, what_you_did_when_you_experienced_these_problems, please_describe_the_problem, what_do_you_do_when_you_dont_have_enough_water, main_cause_of_problems_with_water, the_water_quality_problem, what_problems_and_what_you_did, assistance_usage, assistance_from_what_source
6 4/17/2022 15:42 4/17/2022 13:17 4/17/2022 13:42 714132bb35b619f1 NA NA NA mealenumerator 1497 NA 1 1 1 1 ashif 2022-04-17 bangladesh host whykong 8 purbo moheskhalia para NA NA 21.0433602 92.2336067 -27.1 4.7 female 410 myself widowed 57 myself self boy_children girl_children other_family_members shared_responsibility 1 0 1 1 1 1 1 4 6 1 NA 0 NA NA NA no_formal_education concrete/brick NA 4 1 owned NA 1 1 0 wood NA 4 4 employment_in_private_sector_ngo own_business 0 1 0 1 0 0 1 covid 19 false true true true true 11-20_times 11-20_times announced/scheduled 3-10_times 3-10_times 3-10_times 3-10_times 3-10_times 3-10_times 3-10_times 1-2_times 1-2_times 3-10_times 11-20_times stand_pipe stand_pipe stand_pipe 40 2 0 1 the water tastes salty and the water turns red. 40 2 0 other_specify 0 0 0 1 0 no treatment. 1 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650181128434.jpg?uuid=uuid%3a18171dc5-d2e5-46dc-a2fd-eb225e616508 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650181178229.jpg?uuid=uuid%3a18171dc5-d2e5-46dc-a2fd-eb225e616508 3 5 4 2 dram, 2 water pot NA march april 0 0 1 1 0 0 0 0 0 0 0 0 the water level goes down. arranges water supply. none_of_the_above 0 0 0 0 0 0 0 0 1 no_i_work_for_somebody_else stand_pipe yes_quantity yes_quality 1 1 0 the taste of water decreases and the color turns red. i have collected water from the house next door. very_important very_important vc_permonth_on_set_factors somewhat_satisfied safe_water_services good 3 4 6 fairly_often fairly_often fairly_often very_often none_of_the_above 0 0 0 0 0 1 no excellent no uuid:18171dc5-d2e5-46dc-a2fd-eb225e616508 2204171148 uuid:18171dc5-d2e5-46dc-a2fd-eb225e616508 good

(may 4, 2022 5:13:00 pm): [ submission un-approved. ]

(may 8, 2022 11:30:57 am): [ submission approved. classified as unknown.]
(may 4, 2022 5:16:51 pm): what_you_did_when_you_experienced_these_problems, please_describe_the_problem, what_do_you_do_when_you_dont_have_enough_water, main_cause_of_problems_with_water, other, specifications, the_water_quality_problem
7 4/17/2022 15:42 4/17/2022 13:43 4/17/2022 14:05 714132bb35b619f1 NA NA NA mealenumerator 1352 NA 1 1 1 1 ashif 2022-04-17 bangladesh host whykong 8 purbo moheskhalia para NA NA 21.0433432 92.2337799 -48 4.9 female 247 myself married 35 myself self girl_children 1 0 0 1 0 0 3 1 3 2 NA 0 NA NA NA no_formal_education wood/canvas/plastic NA 3 1 owned NA 1 1 0 wood NA 3 3 own_business 0 0 0 1 0 0 0 NA false true true true true 3-10_times 3-10_times announced/scheduled 3-10_times 3-10_times 3-10_times 3-10_times 3-10_times 1-2_times 3-10_times 1-2_times 3-10_times 11-20_times 11-20_times stand_pipe other surface_water_pond_river_lake 30 2 0 1 the taste becomes salty and the color turns red. 120 4 0 other_specify 0 0 0 1 0 no treatment is given 1 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650182298246.jpg?uuid=uuid%3aa9bc0cc8-1ff4-4f25-bcd5-91465479d019 https://oxfamrohingyar.surveycto.com/view/submission-attachment/1650182353626.jpg?uuid=uuid%3aa9bc0cc8-1ff4-4f25-bcd5-91465479d019 10 10 0 one dram, 15 small bottol NA march april 0 0 1 1 0 0 0 0 0 0 0 0 going down the water level i arrange water supply. none_of_the_above 0 0 0 0 0 0 0 0 1 no_i_work_for_somebody_else stand_pipe yes_quantity yes_quality 1 1 0 the amount decreases and the water turns red. i supply water from the next village. very_important very_important fixed_cost_per_month somewhat_satisfied safe_water_services fair 4 5 5 fairly_often fairly_often fairly_often fairly_often none_of_the_above 0 0 0 0 0 1 no excellent no uuid:a9bc0cc8-1ff4-4f25-bcd5-91465479d019 2204171148 uuid:a9bc0cc8-1ff4-4f25-bcd5-91465479d019 good

(may 4, 2022 5:30:22 pm): [ submission un-approved. ]

(may 8, 2022 11:31:23 am): [ submission approved. classified as unknown.]
(may 4, 2022 5:34:39 pm): what_you_did_when_you_experienced_these_problems, please_describe_the_problem, what_do_you_do_when_you_dont_have_enough_water, main_cause_of_problems_with_water, other, specifications, the_water_quality_problem
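
The header handling used to build raw_data can be illustrated on a toy data frame. This is a sketch only: the `toy` object and its values are made up for illustration, and it assumes a sheet read without headers in which row 1 holds variable codes (some blank) and row 2 holds field names.

```r
# Toy sheet: two columns have no variable code and share the same field name
toy <- data.frame(
  V1 = c(NA, "informed_consent", "1"),
  V2 = c(NA, "informed_consent", "1"),
  V3 = c("sd1", "head_of_hh", "myself"),
  stringsAsFactors = FALSE
)

# Fill blank variable codes (row 1) from the field names (row 2)
for (i in seq_len(ncol(toy))) {
  if (is.na(toy[1, i])) {
    toy[1, i] <- toy[2, i]
  }
}

# Use row 1 as column names, appending digits where names repeat
colnames(toy) <- make.unique(as.character(unlist(toy[1, ])), sep = "_")
toy <- toy[-1, ]

print(colnames(toy))
# "informed_consent" "informed_consent_1" "sd1"
```

As in raw_data, the field-name row remains as the first data row at this stage and is dropped later.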

Numeric Data

A num_data object is created for use in plotting and visualisation. All survey responses (except free text and factor data) are transformed to the numeric code as defined in the survey.
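
As an illustration of this kind of recoding, a named lookup vector can map each response label to its code in one step. This is a sketch of an alternative formulation, not the code used in the application. The helper name `recode_with_lookup()` is hypothetical, and the codes shown are the frequency codes applied to the hw questions in this report.

```r
# Lookup table: response label -> numeric code (frequency codes for hw questions)
hw_codes <- c("0_times" = 1, "1-2_times" = 2, "3-10_times" = 3,
              "11-20_times" = 4, "more_than_20_times" = 5)

# Hypothetical helper: replace known labels with their codes, leave the rest as-is
recode_with_lookup <- function(x, codes) {
  hit <- x %in% names(codes)
  x[hit] <- codes[x[hit]]
  x
}

responses <- c("0_times", "3-10_times", NA, "free text")
recoded <- recode_with_lookup(responses, hw_codes)
print(recoded)
# "1" "3" NA "free text"
```

The result is still a character vector at this point, so a later class-correction step is needed before the values can be treated as numbers.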

The following transformations are applied:

  • Variables of no use to application functionality are dropped.
  • Survey responses converted to numeric code.
  • Multiple choice questions encoded as expanded binary responses are collapsed.
  • Data classes corrected.
  • Various instances of inconsistent recording rectified (e.g. sd9: ‘Years since leaving Myanmar’ encoded variously as number of years as well as actual year).
  • Various new variables computed and added (e.g. Total HWISE score calculated from all HW response variables).
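
Two of the transformations listed above can be sketched on toy data. Everything here is an assumption made for illustration: the question `q1`, the helper names, the rule that several selections map to one extra "multiple" code, the choice to express sd9-style values as years elapsed, and the 1900 threshold and 2022 survey year. The application's own rules may differ.

```r
# Toy data: a hypothetical multiple-choice question q1 stored as binary columns,
# plus a column mixing "years since" values with calendar years
toy <- data.frame(
  q1_1 = c(1, 0, 1, 0),
  q1_2 = c(0, 1, 1, 0),
  q1_3 = c(0, 0, 0, 0),
  left = c("5", "2017", NA, "2019"),
  stringsAsFactors = FALSE
)

# Collapse binary columns into one code (assumed rule):
# one selection -> option index, several -> extra code, none -> NA
collapse_binary <- function(df, multi_code = ncol(df) + 1) {
  m <- as.matrix(df)
  n_selected <- rowSums(m == 1, na.rm = TRUE)
  first_selected <- apply(m, 1, function(r) {
    idx <- which(r == 1)
    if (length(idx) > 0) idx[1] else NA_integer_
  })
  unname(ifelse(n_selected == 0, NA,
                ifelse(n_selected == 1, first_selected, multi_code)))
}

# Treat values above the threshold as calendar years and convert to years elapsed
normalise_years_since <- function(x, survey_year = 2022, year_threshold = 1900) {
  x <- suppressWarnings(as.numeric(x))
  ifelse(!is.na(x) & x > year_threshold, survey_year - x, x)
}

toy$q1 <- collapse_binary(toy[, c("q1_1", "q1_2", "q1_3")])
toy$years_since <- normalise_years_since(toy$left)

print(toy$q1)           # 1 2 4 NA
print(toy$years_since)  # 5 5 NA 3
```
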
# Drop unneeded columns
num_data <- select(raw_data,
             "location",
             "union",
             "ward",
             "village",
             "camp",
             "block",
             "gps-latitude",
             "gps-longitude",
             "sex",
             starts_with(c("sd","w","p","hw","wat","liv","wb")),
             -c("sd5","sd13_1","sd16_1","sd20_1","w1","wat12",
                "wat12_6","wat13","wat13_1","wat13_2","wat14_4","wat15",
                "liv1","liv4")) %>%
  slice(-1)
    
# Replace text responses with numeric codes ----

## True / False
num_data[ , ] <- apply(num_data[ , ], 2, function(x) {
  x = replace(x, which(x=="true"),1)
  x = replace(x, which(x=="false"), 0)
  x  # return the recoded column explicitly
})

## sd questions
num_data[, grepl("sd", names(num_data))] <- apply(num_data[,grepl("sd", names(num_data))], 2, function(x){
  x = replace(x, which(x=="myself"|x=="single"|x=="primary_school"|x=="concrete/brick"|x=="owned"|x=="wood"),1)
  x = replace(x, which(x=="my_spouse"|x=="divorced"|x=="secondary_school"|x=="processed_wood"|x=="rented"|x=="gas_bottles"),2)
  x = replace(x, which(x=="an_adult_child"|x=="adult_child"|x=="widowed"|x=="university_college"|x=="wood/canvas/plastic"|x=="allocated/given_by_authorities"),3)
  x = replace(x, which(x=="a_grandparent"|x=="married"|x=="no_formal_education"),4)
  x  # return the recoded column explicitly
})
### "other" not consistently recorded
num_data[c("sd1","sd4","sd12","sd13")][num_data[c("sd1","sd4","sd12","sd13")] == "other"] <- 5
num_data["sd16"][num_data["sd16"] == "other"] <- 4
num_data["sd20"][num_data["sd20"] == "other"] <- 3

## hw questions
num_data[, grepl("hw", names(num_data))] <- apply(num_data[, grepl("hw",names(num_data))], 2, function(x){
  x = replace(x, which(x=="0_times"|x=="announced/scheduled"), 1)
  x = replace(x, which(x=="1-2_times"|x=="unexpected"), 2)
  x = replace(x, which(x=="3-10_times"), 3)
  x = replace(x, which(x=="11-20_times"), 4)
  x = replace(x, which(x=="more_than_20_times"), 5)
  x = replace(x, which(x=="not_applicable"), 88)
  x = replace(x, which(x=="dont know"), 99)
})

## wat questions
num_data[, grepl("wat", names(num_data))] <- apply(num_data[, grepl("wat",names(num_data))], 2, function(x){
  x = replace(x, which(x=="no"), 0)
  x = replace(x, which(x=="piped_water_to_dwelling"|x=="yes"), 1)
  x = replace(x, which(x=="stand_pipe"|x=="dont know"), 2)
  x = replace(x, which(x=="borehole/tubewell"), 3)
  x = replace(x, which(x=="protected_dug_well"), 4)
  x = replace(x, which(x=="unprotected_dug_well"), 5)
  x = replace(x, which(x=="protected_sping"), 6)
  x = replace(x, which(x=="unprotected_spring"), 7)
  x = replace(x, which(x=="rainwater_collection"), 8)
  x = replace(x, which(x=="small_water_vendor"), 9)
  x = replace(x, which(x=="tanker_truck"), 10)
  x = replace(x, which(x=="bottled_water"), 11)
  x = replace(x, which(x=="bagged_sachet_water"), 12)
  x = replace(x, which(x=="surface_water_pond_river_lake"), 13)
  x = replace(x, which(x=="other_person"), 14)
  x = replace(x, which(x=="other"), 15)
})

## liv questions
num_data[, grepl("liv",names(num_data))] <- apply(num_data[, grepl("liv",names(num_data))], 2, function(x){
  x = replace(x, which(x=="piped_water_to_dwelling"|x=="yes_i_own_the_businesses"), 1)
  x = replace(x, which(x=="stand_pipe"|x=="no_i_work_for_somebody_else"), 2)
  x = replace(x, which(x=="borehole/tubewell"), 3)
  x = replace(x, which(x=="protected_dug_well"), 4)
  x = replace(x, which(x=="unprotected_dug_well"), 5)
  x = replace(x, which(x=="protected_sping"), 6)
  x = replace(x, which(x=="unprotected_spring"), 7)
  x = replace(x, which(x=="rainwater_collection"), 8)
  x = replace(x, which(x=="small_water_vendor"), 9)
  x = replace(x, which(x=="tanker_truck"), 10)
  x = replace(x, which(x=="bottled_water"), 11)
  x = replace(x, which(x=="bagged_sachet_water"), 12)
  x = replace(x, which(x=="surface_water_pond_river_lake"), 13)
  x = replace(x, which(x=="other_person"), 14)
  x = replace(x, which(x=="other"), 15)
  x = replace(x, which(is.na(x)), 88)
  })

## pay questions
num_data[, grepl("pay", names(num_data))] <- apply(num_data[, grepl("pay",names(num_data))], 2, function(x){
  x=replace(x, which(x=="not_important_at_all"|x=="fixed_cost_per_month"),1)
  x=replace(x, which(x=="neither_imp_nor_unimportant"|x=="fixed_cost_per_amount_"),2)
  x=replace(x, which(x=="somewhat_important"|x=="vc_permonth_on_set_factors"),3)
  x=replace(x, which(x=="very_important"|x=="variable_cost_per_amount_or_use"),4)
  x=replace(x, which(x=="no_fixed_payment_is_made"),5)
  x=replace(x, which(x=="it_is_free"),6)
  x=replace(x, which(x=="dont_know"),7)
})

## wb questions
num_data[, grepl("wb", names(num_data))] <- apply(num_data[, grepl("wb",names(num_data))], 2, function(x){
  x = replace(x, which(x=="never"),0)
  x = replace(x, which(x=="very_dissatisfied"|x=="better_health_services"|x=="poor"|x=="almost_never"),1)
  x = replace(x, which(x=="somewhat_dissatisfied"|x=="safe_water_services"|x=="fair"|x=="sometimes"),2)
  x = replace(x, which(x=="somewhat_satisfied"|x=="access_to_education"|x=="good"|x=="fairly_often"),3)
  x = replace(x, which(x=="very_satisfied"|x=="better_communication_with_host_communities"|x=="very_good"|x=="very_often"),4)
  x = replace(x, which(x=="safe_return_to_myanmar"|x=="excellent"),5)
  x = replace(x, which(x=="other"),6)
})

# Define numeric variables ----
num_data <- num_data %>%
  mutate(across(-c("location",
                   "union",
                   "ward",
                   "village",
                   "camp",
                   "block",
                   "sex",
                   "sd11_2",
                   "sd11_3",
                   "w8",
                   "wat8",
                   "wat14_3",
                   "wat16",
                   "wat17",
                   "liv5",
                   "liv6"), as.numeric))
    
# Re-code for consistency ----

## Irregular 'ngo' recording
num_data["sd11_2"][num_data["sd11_2"] == "ngos"] <- "ngo"

## camp names
num_data$camp[grepl("nayapara", num_data$camp)] <- "Nayapara Refugee Camp"
num_data$camp <- sapply(num_data$camp, function(x){
  x = replace(x, which(x=="24"), "Camp 24")
  x = replace(x, which(x=="25"), "Camp 25")
  x = replace(x, which(x=="26"), "Camp 26")
  x = replace(x, which(x=="27"), "Camp 27")
})

## irregular sd9
# Values below 1900 are counts of years, not calendar years; convert them relative to the 2022 survey
years_idx <- !is.na(num_data$sd9) & num_data$sd9 < 1900
num_data$sd9[years_idx] <- 2022 - num_data$sd9[years_idx]
    
# Compute new variables ----

## Overall HWISE score
df_hw <- num_data %>%
  select(starts_with("hw"), -("hw2a"))

num_data$hw <- rowSums(pmin(as.matrix(df_hw - 1), 3))
rm(df_hw)

num_data <- num_data %>% 
  relocate(hw, .before = hw1)
    
## Time to collect per week
num_data <- num_data %>% 
  mutate(wat5a = (wat4 * wat5) * 7) %>%
  relocate(wat5a, .after = wat5)
## Grouped version
num_data$wat5b <- sapply(num_data$wat5a, function(x){
  x = replace(x, which(x<=100), 1)
  x = replace(x, which(x>100 & x<=200), 2)
  x = replace(x, which(x>200 & x<=300), 3)
  x = replace(x, which(x>300 & x<=400), 4)
  x = replace(x, which(x>400 & x<=500), 5)
  x = replace(x, which(x>500 & x<=600), 6)
  x = replace(x, which(x>600), 7)
})
num_data <- num_data %>% relocate(wat5b, .after = wat5a)
    
## Collapse boy/girl under 18 to single var
num_data <- num_data %>% 
  mutate(sd6_ = sd6 + sd6_1) %>%
  relocate(sd6_, .before = sd6) %>%
  select(-c(sd6, sd6_1))
    
# Collapse multiple choice binary questions ----
    
## sd5
# collapse to sd5_ * binaries dropped
num_data <- num_data %>%
  mutate(
    sd5_ = case_when(
      sd5_6 == 1 ~ 6,
      sd5_1 == 1 & sd5_2 == 0 & sd5_3 == 0 & sd5_4 == 0 & sd5_5 == 0 & sd5_6 ==0 ~ 1,
      sd5_1 == 0 & sd5_2 == 1 & sd5_3 == 0 & sd5_4 == 0 & sd5_5 == 0 & sd5_6 ==0 ~ 2,
      sd5_1 == 0 & sd5_2 == 0 & sd5_3 == 1 & sd5_4 == 0 & sd5_5 == 0 & sd5_6 ==0 ~ 3,
      sd5_1 == 0 & sd5_2 == 0 & sd5_3 == 0 & sd5_4 == 1 & sd5_5 == 0 & sd5_6 ==0 ~ 4,
      sd5_1 == 0 & sd5_2 == 0 & sd5_3 == 0 & sd5_4 == 0 & sd5_5 == 1 & sd5_6 ==0 ~ 5,
      TRUE ~ 6)
  ) %>%
  relocate(sd5_, .before = sd5_1) %>%
  select(-c(sd5_1, sd5_2, sd5_3, sd5_4, sd5_5, sd5_6))

## wat12
# collapse into wat12_ * binaries retained
num_data <- num_data %>%
  mutate(
    wat12_ = case_when(
      wat12_1 == 1 & wat12_2 == 0 & wat12_3 == 0 & wat12_4 == 0 & wat12_5 == 0 ~ 1,
      wat12_1 == 0 & wat12_2 == 1 & wat12_3 == 0 & wat12_4 == 0 & wat12_5 == 0 ~ 2,
      wat12_1 == 0 & wat12_2 == 0 & wat12_3 == 1 & wat12_4 == 0 & wat12_5 == 0 ~ 3,
      wat12_1 == 0 & wat12_2 == 0 & wat12_3 == 0 & wat12_4 == 1 & wat12_5 == 0 ~ 4,
      wat12_1 == 0 & wat12_2 == 0 & wat12_3 == 0 & wat12_4 == 0 & wat12_5 == 1 ~ 0,
      TRUE ~ 5
    )
  ) %>%
  relocate(wat12_, .before = wat12_1)
## Create binary wat12_6_ "combined methods"
num_data <- num_data %>%
  mutate(
    wat12_6_ = case_when(
      wat12_ == 5 ~ 1,
      TRUE ~ 0
    )
  ) %>%
  relocate(wat12_6_, .after = wat12_5)
    
## wat15
# collapse into wat15_ * binaries retained
for(i in 1:nrow(num_data)) {
  if(num_data[i, "wat15_1"] == 1 & rowSums(num_data[i, c("wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 1}
  else if(num_data[i, "wat15_2"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 2}
  else if(num_data[i, "wat15_3"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 3}
  else if(num_data[i, "wat15_4"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 4}
  else if(num_data[i, "wat15_5"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 5}
  else if(num_data[i, "wat15_6"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 6}
  else if(num_data[i, "wat15_7"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 7}
  else if(num_data[i, "wat15_8"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_9","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 8}
  else if(num_data[i, "wat15_9"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_10","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 9}
  else if(num_data[i, "wat15_10"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_11","wat15_12")]) == 0) {num_data[i, "wat15_"] = 10}
  else if(num_data[i, "wat15_11"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_12")]) == 0) {num_data[i, "wat15_"] = 11}
  else if(num_data[i, "wat15_12"] == 1 & rowSums(num_data[i, c("wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6","wat15_7","wat15_8","wat15_9","wat15_10","wat15_11")]) == 0) {num_data[i, "wat15_"] = 12}
  else {num_data[i, "wat15_"] = 13}
}
num_data <- num_data %>%
  relocate(wat15_, .before = wat15_1)
    
## liv4
# Collapse into liv4_ * binaries dropped
num_data <- num_data %>%
  mutate(
    liv4_ = case_when(
      liv4_1 == 0 & liv4_2 == 0 & liv4_3 == 1 ~ 0,
      liv4_1 == 1 & liv4_2 == 0 & liv4_3 == 0 ~ 1,
      liv4_1 == 0 & liv4_2 == 1 & liv4_3 == 0 ~ 2,
      TRUE ~ 3
    )
  ) %>%
  relocate(liv4_, .before = liv4_1) %>%
  select(-c(liv4_1, liv4_2, liv4_3))

## liv1
# Collapse into liv1_ * binaries retained
num_data <- num_data %>%
  mutate(
    liv1_ = case_when(
      liv1_1 == 1 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  1,
      liv1_1 == 0 & liv1_2 == 1 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  2,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 1 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  3,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 1 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  4,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 1 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  5,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 1 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 0  ~  6,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 1 & liv1_8 == 0 & liv1_9 == 0  ~  7,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 1 & liv1_9 == 0  ~  8,
      liv1_1 == 0 & liv1_2 == 0 & liv1_3 == 0 & liv1_4 == 0 & liv1_5 == 0 & liv1_6 == 0 & liv1_7== 0 & liv1_8 == 0 & liv1_9 == 1  ~  0,
      TRUE ~ 9
    )
  ) %>%
  relocate(liv1_, .before = liv1_1) 
# create binary liv1_10_ "multiple income sources"
num_data <- num_data %>%
  mutate(
    liv1_10_ = case_when(
      liv1_ == 9 ~ 1,
      TRUE ~ 0
    )
  ) %>% 
  relocate(liv1_10_, .after = liv1_9)
    
# Replace NA vals ----
num_data[is.na(num_data)] <- "NA"

kable(head(num_data)) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
location union ward village camp block gps-latitude gps-longitude sex sd1 sd2 sd3 sd4 sd5_ sd6_ sd7 sd8 sd9 sd11 sd11_1 sd11_2 sd11_3 sd12 sd13 sd14 sd15 sd16 sd17 sd18 sd19 sd20 sd21 sd22 w1_1 w2 w3 w4 w5 w6 w7 w8 wat1 wat2 wat3 wat4 wat5 wat5a wat5b wat6 wat7 wat8 wat9 wat10 wat11 wat12_ wat12_1 wat12_2 wat12_3 wat12_4 wat12_5 wat12_6_ wat14 wat14_1 wat14_2 wat14_3 wat15_ wat15_1 wat15_2 wat15_3 wat15_4 wat15_5 wat15_6 wat15_7 wat15_8 wat15_9 wat15_10 wat15_11 wat15_12 wat16 wat17 wb1 wb2 wb3 wb4 wb5 wb6 wb7 wb8 wb9 wb10 p1 p2 p3 p4 p5 pay1 pay2 pay3 hw hw1 hw2 hw2a hw3 hw4 hw5 hw6 hw7 hw8 hw9 hw10 hw11 hw12 hw13 liv1_ liv1_1 liv1_2 liv1_3 liv1_4 liv1_5 liv1_6 liv1_7 liv1_8 liv1_9 liv1_10_ liv2 liv3 liv4_ liv5 liv6
host whykong 6 jimonkhali NA NA 21.05939 92.22692 male 1 4 40 2 6 2 2 2 NA 0 NA NA NA 1 2 3 0 1 1 0 0 1 5 4 0 0 0 1 0 0 1 business 3 3 3 2 3 42 1 0 1 salty water, iron 2 3 1 2 0 1 0 0 0 0 3 4 1 1 drum 13 0 0 0 0 0 0 0 0 1 1 0 0 the water is salty water needs to be collected from remote area 2 2 4 3 2 2 3 3 3 3 1 1 1 1 1 3 1 6 3 2 2 2 1 1 1 1 1 2 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 0 1 3 2 because the water is salty, the shop has to collect water from far away 1 / pure water is not easy to drink because water is salty. 2 / there is a problem with cooking for water 3 / water is not easily available for business.
host whykong 8 kharangkhali NA NA 21.04305 92.23357 female 2 4 30 2 6 3 2 0 NA 0 NA NA NA 2 3 2 0 1 1 0 0 2 2 3 0 0 0 1 0 0 1 covid-19 2 2 13 120 2 1680 7 0 1 the taste becomes salty, the color becomes red. 90 3 0 4 0 0 0 1 0 0 4 4 6 drum-01 13 0 0 1 1 0 0 0 0 0 0 0 0 the water level goes down. i will make arrangements to collect water. 3 2 2 4 5 5 3 3 3 3 0 1 1 1 1 4 4 1 33 2 4 1 4 4 3 4 4 3 4 3 3 4 4 0 0 0 0 0 0 0 0 0 1 0 2 15 3 in the dry season the water decreases and the water becomes salty and red. need to collect water from others villages
host whykong 6 jimonkhali NA NA 21.05977 92.22733 female 2 4 50 2 6 1 3 2 2017 0 NA NA NA 1 3 2 0 3 0 0 0 2 4 8 0 0 0 1 1 0 1 i have to borrow 3 3 3 3 5 105 2 0 1 water is a little bit solty 3 7 1 2 0 1 0 0 0 0 2 5 2 1 drum 13 0 0 0 0 0 0 1 1 0 0 0 0 water dries up due to overcrowding and overuse. it is difficult to collect water from a distance. 3 1 4 3 2 2 3 1 1 3 1 1 1 1 1 1 1 6 8 2 2 1 1 2 3 1 1 2 2 1 1 1 2 0 0 0 0 0 0 0 0 0 1 0 1 3 2 it is very difficult to drink pure water because the water is salty. water has to be collected and brought from far away.
host whykong 8 purbo moheskhalia para NA NA 21.04336 92.23361 female 1 3 57 1 6 5 6 1 NA 0 NA NA NA 4 1 4 1 1 1 1 0 1 4 4 0 1 0 1 0 0 1 covid 19 2 2 2 40 2 560 6 0 1 the water tastes salty and the water turns red. 40 2 0 4 0 0 0 1 0 0 3 5 4 2 dram, 2 water pot 13 0 0 1 1 0 0 0 0 0 0 0 0 the water level goes down. arranges water supply. 3 2 3 3 4 6 3 3 3 4 0 1 1 1 1 4 4 3 27 4 4 1 3 3 3 3 3 3 3 2 2 3 4 0 0 0 0 0 0 0 0 0 1 0 2 2 3 the taste of water decreases and the color turns red. i have collected water from the house next door.
host whykong 8 purbo moheskhalia para NA NA 21.04334 92.23378 female 1 4 35 1 6 4 3 2 NA 0 NA NA NA 4 3 3 1 1 1 1 0 1 3 3 0 0 0 1 0 0 0 NA 2 15 13 30 2 420 5 0 1 the taste becomes salty and the color turns red. 120 4 0 4 0 0 0 1 0 0 10 10 0 one dram, 15 small bottol 13 0 0 1 1 0 0 0 0 0 0 0 0 going down the water level i arrange water supply. 3 2 2 4 5 5 3 3 3 3 0 1 1 1 1 4 4 1 26 3 3 1 3 3 3 3 3 2 3 2 3 4 4 0 0 0 0 0 0 0 0 0 1 0 2 2 3 the amount decreases and the water turns red. i supply water from the next village.
host whykong 6 jimonkhali NA NA 21.05931 92.22797 male 1 4 20 1 6 0 2 0 1991 0 NA NA NA 4 3 1 0 3 0 0 0 2 5 3 0 0 0 0 1 0 1 i worked in agriculture and had accumulated rice and ate them. and some have had to borrow. 3 3 3 7 11 539 6 0 0 NA 7 8 0 4 0 0 0 1 0 0 4 4 4 4 water jar 13 0 0 0 0 0 0 1 1 1 0 0 0 water naturally becomes salt. it is difficult to collect water from far away. 2 1 4 6 5 7 3 3 4 3 1 1 1 1 1 4 4 6 18 4 4 2 3 3 2 2 2 2 3 2 2 1 1 9 1 0 0 0 1 0 0 0 0 1 1 3 3 due to the low water content and the addition of water and salt, pure water and cooking are very difficult. we were bound to use this salty water.
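
The collapse of expanded binary columns shown above for wat15 can also be written in vectorised form. The following is a minimal sketch rather than the application's code: `collapse_binaries`, `toy` and the `q_*` column names are hypothetical, and it assumes the indicator columns hold only 0 and 1 with no missing values. As in the wat15 loop, rows with no selection or with several selections receive the fallback code.

```r
# Hypothetical helper (not part of the original script): collapse a block of
# 0/1 indicator columns into one code giving the position of the single
# selected column. Rows with zero or several selections get `multi_code`.
collapse_binaries <- function(df, cols, multi_code = length(cols) + 1) {
  hits <- (as.matrix(df[, cols]) == 1) * 1       # numeric 0/1 matrix
  n_selected <- rowSums(hits)                    # selections per row
  first_hit <- max.col(hits, ties.method = "first")
  ifelse(n_selected == 1, first_hit, multi_code)
}

# Toy data standing in for a multiple choice question with three options
toy <- data.frame(
  q_1 = c(1, 0, 0, 1),
  q_2 = c(0, 1, 0, 1),
  q_3 = c(0, 0, 0, 0)
)
toy$q_ <- collapse_binaries(toy, c("q_1", "q_2", "q_3"))
print(toy)
```

A helper of this kind avoids one hand-written branch per option, at the cost of treating "none selected" and "several selected" identically unless a separate code is added.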

Question Data

A q_data object is created to store the question text for all items in the survey. The relevant row is extracted from raw_data and matched to the contents of num_data. String manipulation is performed using the stringr library (Wickham, 2022), including the following:

  • Capitalisation correction.
  • Special character removal.
  • Modifications to ensure brevity, consistency and enhance readability.
# Extract question row
q_data <- raw_data[1, ]
    
# Match cols to num_data
q_data <- q_data[, colnames(q_data) %in% colnames(num_data)]

# String formatting
q_data <- q_data %>% mutate(across(everything(), 
                              ~ str_replace(., "income_source_past_12_months_", "income_source:_") %>%
                                str_replace("primary_way_household_treats_drinking_water","drinking_water_treatment:") %>%
                                str_replace("your_household_experience_the_fewest_water_problems", "fewest_water_problems:") %>%
                                str_replace("activities_are_you_engaged_in_for_household_food_production_or_income_generation", "income_generation:") %>%
                                str_replace("_are_","_") %>%
                                str_remove("_remittances") %>%
                                str_replace("_the_","_") %>%
                                str_replace("receiving_assistance","financial assistance: amount") %>%
                                str_replace("assistance_from_what_source", "financial assistance: source") %>%
                                str_replace("assistance_usage", "financial assistance: usage") %>%
                                str_remove("_for_heating-cooking")  %>%
                                str_remove("_food_for_work") %>%
                                str_remove("_agriculture_and_animal") %>%
                                str_remove("you_had_") %>%
                                str_remove("ngo") %>%
                                str_remove("do_you_think_it_is_for_you_") %>%
                                str_replace_all("_and_","&") %>%
                                str_replace_all("_"," ") %>%
                                str_to_title)) %>% 
  mutate(location = "Refugee Camp or Host Community",
         sd2 = "Relationship Status",
         sd4 = "Household: Responsible To Get Water",
         sd7 = "Household: Adult Members",
         sd8 = "Household: Elderly Members",
         sd9 = "Left Myanmar",
         sd13 = "House: Type",
         sd14 = "House: No. of Rooms",
         sd15 = "House: Has a Garden",
         sd16 = "House: Ownership",
         sd17 = "House: Electricity Supply",
         sd18 = "House: Piped Water Supply",
         sd19 = "House: Sewerage Connection",
         sd21 = "Rate Community: Socioeconomic Standing",
         sd22 = "Rate Community: Water Situation",
         w8 = "Experienced Problems & Solutions",
         p2 = "Women & Men: Equal Responsibility for Sanitation",
         p3 = "Women & Men: Equal Awareness of Feedback Processes",
         p4 = "Women & Men: Feedback Equally Valued",
         p5 = "Women & Men: Equal Awareness of Sanitation Rights",
         hw1 = "Worry About Water Supply",
         hw2 = "Supply Interruptions",
         hw2a = "Supply Interruptions: Expected or Unexpected",
         hw3 = "Unable to do Laundry Due to Water Situation",
         hw4 = "Schedule Change Due to Water Situation",
         hw5 = "Change What Was Eaten Due to Water Situation",
         hw6 = "Unable to Wash Hands After Dirty Activity",
         hw7 = "Unable to Wash Body Due to Water Situation",
         hw8 = "Not Enough Water to Drink",
         hw9 = "Felt Anger About Water Situation",
         hw10 = "Gone to Sleep Thirsty",
         hw11 = "No Useable or Drinkable Water",
         hw12 = "Felt Shame About Water Situation",
         hw13 = "Asked to Borrow Water",
         wat1 = "Drinking Water: Primary Source",
         wat2 = "Drinking Water: Secondary Source",
         wat3 = "Non-drinking Water: Primary Source",
         wat4 = "Drinking Water: Time to Source (mins)",
         wat5 = "Drinking Water: No.of Trips",
         wat9 = "Non-drinking Water: Time to Source (mins)",
         wat10 = "Non-drinking Water: No.of Trips",
         wat15_8 = "Fewest Water Problems: August",
         liv3 = "Income Generation: Primary Water Source",
         wb3 = "General Health",
         wb4 = "No. Days Poor Physical Health",
         wb5 = "No. Days Poor Mental Health",
         wb6 = "No. Days Health Prevented Normal Activities",
         wb7 = "Feel Unable to Control Important Things",
         wb8 = "Feel Confident in Ability to Control Problems",
         wb9 = "Feel Things Going Your Way",
         wb10 = "Feel Difficulties Could Not Be Overcome"
         )
    
# Add newly created variables
q_data <- q_data %>% mutate(
  sd5_ = "Household: Ensures Sufficient Water",
  sd6_ = "Household: Children Under 18",
  wat12_ = "Drinking Water Treatment",
  wat12_6_ = "Drinking Water Treatment: Multiple Methods",
  wat15_ = "Fewest Water Problems",
  hw = "Calculated HWISE Score",
  liv1_ = "Income Generation",
  liv1_10_ = "Income Generation: Multiple Activities",
  liv4_ = "Income Generation: Problematic Water Quantity or Quality",
  wat5a = "Drinking Water: Collection Time Per Week (mins)",
  wat5b = "Drinking Water: Collection Time Per Week (grouped)"
) %>%
  relocate(sd5_, .after = sd4) %>%
  relocate(sd6_, .after = sd5_) %>%
  relocate(wat12_, .before = wat12_1) %>%
  relocate(wat12_6_, .after = wat12_5) %>%
  relocate(wat15_, .before = wat15_1) %>%
  relocate(hw, .before = hw1) %>%
  relocate(liv1_, .before = liv1_1) %>%
  relocate(liv1_10_, .after = liv1_9) %>%
  relocate(liv4_, .after = liv3) %>%
  relocate(wat5a, .after = wat5) %>%
  relocate(wat5b, .after = wat5a)

kable(head(q_data)) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
location union ward village camp block gps-latitude gps-longitude sex sd1 sd2 sd3 sd4 sd5_ sd6_ sd7 sd8 sd9 sd11 sd11_1 sd11_2 sd11_3 sd12 sd13 sd14 sd15 sd16 sd17 sd18 sd19 sd20 sd21 sd22 w1_1 w2 w3 w4 w5 w6 w7 w8 p1 p2 p3 p4 p5 hw hw1 hw2 hw2a hw3 hw4 hw5 hw6 hw7 hw8 hw9 hw10 hw11 hw12 hw13 wat1 wat2 wat3 wat4 wat5 wat5a wat5b wat6 wat7 wat8 wat9 wat10 wat11 wat12_ wat12_1 wat12_2 wat12_3 wat12_4 wat12_5 wat12_6_ wat14 wat14_1 wat14_2 wat14_3 wat15_ wat15_1 wat15_2 wat15_3 wat15_4 wat15_5 wat15_6 wat15_7 wat15_8 wat15_9 wat15_10 wat15_11 wat15_12 wat16 wat17 liv1_ liv1_1 liv1_2 liv1_3 liv1_4 liv1_5 liv1_6 liv1_7 liv1_8 liv1_9 liv1_10_ liv2 liv3 liv4_ liv5 liv6 pay1 pay2 pay3 wb1 wb2 wb3 wb4 wb5 wb6 wb7 wb8 wb9 wb10
2 Refugee Camp or Host Community Union Ward Village Camp Block Gps-Latitude Gps-Longitude Sex Head Of Hh Relationship Status Age Household: Responsible To Get Water Household: Ensures Sufficient Water Household: Children Under 18 Household: Adult Members Household: Elderly Members Left Myanmar Financial Assistance Financial Assistance: Amount Financial Assistance: Source Financial Assistance: Usage Highest Level Of Education House: Type House: No. of Rooms House: Has a Garden House: Ownership House: Electricity Supply House: Piped Water Supply House: Sewerage Connection Primary Fuel Source Rate Community: Socioeconomic Standing Rate Community: Water Situation Income Source: Employment In Government Income Source: Employment In Private Sector Income Source: Casual Labour Income Source: Own Business Income Source: Farming Income Source: From Family Relative Difficult To Obtain Sufficient Income Experienced Problems & Solutions Water&Sanitation Needs Being Met Women & Men: Equal Responsibility for Sanitation Women & Men: Equal Awareness of Feedback Processes Women & Men: Feedback Equally Valued Women & Men: Equal Awareness of Sanitation Rights Calculated HWISE Score Worry About Water Supply Supply Interruptions Supply Interruptions: Expected or Unexpected Unable to do Laundry Due to Water Situation Schedule Change Due to Water Situation Change What Was Eaten Due to Water Situation Unable to Wash Hands After Dirty Activity Unable to Wash Body Due to Water Situation Not Enough Water to Drink Felt Anger About Water Situation Gone to Sleep Thirsty No Useable or Drinkable Water Felt Shame About Water Situation Asked to Borrow Water Drinking Water: Primary Source Drinking Water: Secondary Source Non-drinking Water: Primary Source Drinking Water: Time to Source (mins) Drinking Water: No.of Trips Drinking Water: Collection Time Per Week (mins) Drinking Water: Collection Time Per Week (grouped) Injured While Fetching Water Problem With Water Quality The Water Quality Problem 
Non-drinking Water: Time to Source (mins) Non-drinking Water: No.of Trips Treated Water To Make It Safer Drinking Water Treatment Drinking Water Treatment: Boil Drinking Water Treatment: Filter Drinking Water Treatment: Add Chemicals Chlorine Tablets Drinking Water Treatment: Other Specify Drinking Water Treatment: Do Nothing Drinking Water Treatment: Multiple Methods Kolosh Buckets Jerry Cans Other Fewest Water Problems Fewest Water Problems: January Fewest Water Problems: February Fewest Water Problems: March Fewest Water Problems: April Fewest Water Problems: May Fewest Water Problems: June Fewest Water Problems: July Fewest Water Problems: August Fewest Water Problems: September Fewest Water Problems: October Fewest Water Problems: November Fewest Water Problems: December Main Cause Of Problems With Water What Do You Do When You Dont Have Enough Water Income Generation Income Generation: Crop Production Income Generation: Betel Nut Leaf Production Income Generation: Livestock Production Income Generation: Salt Production Income Generation: Sea Fishing Income Generation: Aquaculture Shrimp Farming Income Generation: Aquaculture Fish Farming Income Generation: Other Income Generation: None Of Above Income Generation: Multiple Activities Do You Own Businesses Income Generation: Primary Water Source Income Generation: Problematic Water Quantity or Quality Please Describe Problem What You Did When You Experienced These Problems Important To Pay To Install Your Own Water System Important To Pay For Regular Costs Fee Structure For Use Of Main System How Do You Feel About Your Life Change Would Make Biggest Positive Difference General Health No. Days Poor Physical Health No. Days Poor Mental Health No. Days Health Prevented Normal Activities Feel Unable to Control Important Things Feel Confident in Ability to Control Problems Feel Things Going Your Way Feel Difficulties Could Not Be Overcome
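
The effect of the string formatting above is easiest to see on individual strings. The sketch below applies a small subset of the same stringr calls; `tidy_label` and the two input strings are hypothetical, chosen only to resemble the raw question text, so this illustrates the approach rather than reproducing the application's code.

```r
library(stringr)

# Hypothetical helper: a subset of the replacements applied to q_data
tidy_label <- function(x) {
  x <- str_replace(x, "income_source_past_12_months_", "income_source:_")
  x <- str_replace_all(x, "_and_", "&")   # shorten "and" before removing underscores
  x <- str_replace_all(x, "_", " ")       # underscores to spaces
  str_to_title(x)                         # capitalise each word
}

# Assumed examples of raw, snake_case question text
raw_labels <- c("income_source_past_12_months_own_business",
                "water_and_sanitation_needs_being_met")
print(tidy_label(raw_labels))
```

The order of the calls matters: the specific phrase replacements must run before underscores are converted to spaces, otherwise their patterns no longer match.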

Text Data

A text_data object is created to store all data as readable text responses. The application uses this dataset to render readable labels and axis ticks when creating plots and visualisations. Data is read from the original Excel sheet and the following transformations are applied:

  • Columns matched to num_data content.
  • Numeric response data converted to intelligible text responses.
  • String formatting including:
    • Capitalisation correction.
    • Special character removal.
    • Modifications to ensure brevity, consistency and enhance readability.
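
The conversion of numeric codes to readable text can be sketched with a named lookup vector. This is an illustrative alternative, not the method used in the script that follows: `fuel_labels` and `codes` are hypothetical names, and the labels are those used for sd20 (primary fuel source).

```r
# Hypothetical lookup table: numeric code (as a name) -> readable label
fuel_labels <- c("1" = "Wood", "2" = "Gas Bottles", "3" = "Other")

# Toy vector of coded responses, including one missing value
codes <- c(1, 3, 2, NA, 1)

# Index the lookup by the character form of each code; missing or
# unmatched codes come back as NA and are written out as the string "NA"
labels <- unname(fuel_labels[as.character(codes)])
labels[is.na(labels)] <- "NA"
print(labels)
```

Keeping the mapping in one vector means a code and its label are defined in a single place, which makes inconsistencies between the numeric and text datasets easier to spot.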
# Read Excel
text_data <- try(xl.read.file("../data/demo_data.xlsx",
                 header = FALSE,
                 password =  "password",
                 top.left.cell = "A3")) # ignores row 1 & 2 (chr), ensures single accurate class per col

# Set colnames as question codes
colnames(text_data) <- colnames(raw_data)

# Match cols to num_data
text_data <- text_data[, colnames(text_data) %in% colnames(num_data)]

# Add newly created variables
`%!in%` <- Negate(`%in%`) # create 'not in' function
text_data <- bind_cols(text_data, num_data[, colnames(num_data) %!in% colnames(text_data)])
text_data <- text_data %>%
  relocate(sd5_, .after = sd4) %>%
  relocate(sd6_, .after = sd5_) %>%
  relocate(wat12_, .before = wat12_1) %>%
  relocate(wat12_6_, .after = wat12_5) %>%
  relocate(wat15_, .before = wat15_1) %>%
  relocate(hw, .before = hw1) %>%
  relocate(liv1_, .before = liv1_1) %>%
  relocate(liv1_10_, .after = liv1_9) %>%
  relocate(liv4_, .after = liv3) %>%
  relocate(wat5a, .after = wat5) %>%
  relocate(wat5b, .after = wat5a)

# Insert correct text responses
text_data$camp <- num_data$camp
text_data$sd11_2 <- num_data$sd11_2

text_data$sd5_ <- sapply(text_data$sd5_, function(x) {
  switch(x, "Self", "Spouse", "Children: Boy", "Children: Girl", "Other", "Shared Responsibility")})

text_data$sd9 <- num_data$sd9

## Define a list of all binary questions to format together
binary_q <- c("sd11","sd15","sd17","sd18","sd19","w1_1","w2","w3","w4","w5",
              "w6","w7","wat6","wat7","wat11","wat12_1","wat12_2","wat12_3","wat12_4",
              "wat12_5","wat12_6_","wat15_1","wat15_2","wat15_3","wat15_4","wat15_5","wat15_6",
              "wat15_7","wat15_8","wat15_9","wat15_10","wat15_11","wat15_12","liv1_1","liv1_2",
              "liv1_3","liv1_4","liv1_5","liv1_6","liv1_7","liv1_8","liv1_9","liv1_10_","liv2")
## format all binaries
text_data[binary_q] <- lapply(text_data[binary_q], function(x){
  x = replace(x, which(x==0),"No")
  x = replace(x, which(x==1),"Yes")
  x = replace(x, which(x==2),"Don't Know")
  x = replace(x, which(x==88),"NA")})

text_data$sd20 <- sapply(text_data$sd20, function(x) {
  x = replace(x, which(x==1),"Wood")
  x = replace(x, which(x==2),"Gas Bottles")
  x = replace(x, which(x==3),"Other")}) 

text_data[c("wat1","wat2","wat3","liv3")] <- lapply(text_data[c("wat1","wat2","wat3","liv3")], function(x){
  x = replace(x, which(x==1),"Piped Supply")
  x = replace(x, which(x==2),"Stand Pipe")
  x = replace(x, which(x==3),"Borehole/Tubewell")
  x = replace(x, which(x==4),"Dug Well: Protected")
  x = replace(x, which(x==5),"Dug Well: Unprotected")
  x = replace(x, which(x==6),"Spring: Protected")
  x = replace(x, which(x==7),"Spring: Unprotected")
  x = replace(x, which(x==8),"Rainwater Collection")
  x = replace(x, which(x==9),"Small Water Vendor")
  x = replace(x, which(x==10),"Tanker Truck")
  x = replace(x, which(x==11),"Bottled Water")
  x = replace(x, which(x==12),"Sachet Water")
  x = replace(x, which(x==13),"Surface Water/Pond/River/Lake")
  x = replace(x, which(x==14),"Other Person")
  x = replace(x, which(x==15),"Other")
  x = replace(x, which(is.na(x)),"NA")
})
## Change to grouped values 
text_data$wat5b <- sapply(text_data$wat5b, function(x){
  x = replace(x, which(x==1), "0-100 mins")
  x = replace(x, which(x==2), "101-200 mins")
  x = replace(x, which(x==3), "201-300 mins")
  x = replace(x, which(x==4), "301-400 mins")
  x = replace(x, which(x==5), "401-500 mins")
  x = replace(x, which(x==6), "501-600 mins")
  x = replace(x, which(x==7), "600+ mins")
})

text_data$wat12_ <- sapply(text_data$wat12_, function(x){
  switch(x+1, "None","Boil","Filter","Add Chemicals/Chlorine","Other","Combined Methods")})

text_data$wat15_ <- sapply(text_data$wat15_, function(x) {
  switch(x, "Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec","Multiple Months")})

text_data$liv1_ <- sapply(text_data$liv1_, function(x) {
  switch(x+1, "None of the Above", "Crop Production","Betel Nut/Leaf Production",
         "Livestock Production","Salt Production","Sea Fishing","Shrimp Farming",
         "Fish Farming","Other","Multiple Activities")})

text_data$liv4_ <- sapply(text_data$liv4_, function(x){
  switch(x+1, "No","Quantity","Quality","Both Quantity & Quality")})

text_data[, grepl("^hw.*\\d$", names(text_data))] <- apply(text_data[, grepl("^hw.*\\d$", names(text_data))], 2, function(x){
  x = replace(x, which(x=="0_times"), "Never")
  x = replace(x, which(x=="1-2_times"), "Rarely")
  x = replace(x, which(x=="3-10_times"), "Sometimes")
  x = replace(x, which(x=="11-20_times"), "Often")
  x = replace(x, which(x=="more_than_20_times"), "Always")
  x = replace(x, which(x=="dont_know"), "Don't Know")
})

text_data[, grepl("pay", names(text_data))] <- apply(text_data[,grepl("pay", names(text_data))], 2, function(x){
  x = replace(x, which(x=="Fixed_cost_per_amount"), "Fixed: Per Amount")
  x = replace(x, which(x=="Fixed_cost_per_month"), "Fixed: Per Month")
  x = replace(x, which(x=="VC_permonth_on_set_factors"), "Variable: Set Factors")
  x = replace(x, which(x=="Variable_cost_per_amount_or_use"), "Variable: Per Amount")
  x = replace(x, which(x=="no_fixed_payment_is_made"), "No Fixed Payment")
  x = replace(x, which(x=="it_is_free"), "Free")
})    

text_data <-  text_data %>% mutate(across(where(is.character),
                            ~ str_replace_all(., "_and_", "_&_") %>%
                              str_replace_all("_", " ") %>%
                              str_to_title)) %>% mutate(across(where(is.logical), str_to_title))

text_data <- text_data %>% mutate(across(c("w8","wat8","wat16","wat17","liv6","liv5"), 
                ~ str_replace_all(., "\n", " ") %>%
                  str_squish %>%
                  str_to_sentence))

text_data[is.na(text_data)] <- "NA"

kable(head(text_data)) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
location union ward village camp block gps-latitude gps-longitude sex sd1 sd2 sd3 sd4 sd5_ sd6_ sd7 sd8 sd9 sd11 sd11_1 sd11_2 sd11_3 sd12 sd13 sd14 sd15 sd16 sd17 sd18 sd19 sd20 sd21 sd22 w1_1 w2 w3 w4 w5 w6 w7 w8 p1 p2 p3 p4 p5 hw hw1 hw2 hw2a hw3 hw4 hw5 hw6 hw7 hw8 hw9 hw10 hw11 hw12 hw13 wat1 wat2 wat3 wat4 wat5 wat5a wat5b wat6 wat7 wat8 wat9 wat10 wat11 wat12_ wat12_1 wat12_2 wat12_3 wat12_4 wat12_5 wat12_6_ wat14 wat14_1 wat14_2 wat14_3 wat15_ wat15_1 wat15_2 wat15_3 wat15_4 wat15_5 wat15_6 wat15_7 wat15_8 wat15_9 wat15_10 wat15_11 wat15_12 wat16 wat17 liv1_ liv1_1 liv1_2 liv1_3 liv1_4 liv1_5 liv1_6 liv1_7 liv1_8 liv1_9 liv1_10_ liv2 liv3 liv4_ liv5 liv6 pay1 pay2 pay3 wb1 wb2 wb3 wb4 wb5 wb6 wb7 wb8 wb9 wb10
Host Whykong 6 Jimonkhali Na NA 21.05939 92.22692 Male Myself Married 40 My Spouse Shared Responsibility 2 2 2 Na No NA Na NA Primary School Processed Wood 3 No Owned Yes No No Wood 5 4 No No No Yes No No Yes Business True True True True True 3 Rarely Rarely Unexpected Never Never Never Never Never Rarely Never Never Never Never Never Borehole/Tubewell Borehole/Tubewell Borehole/Tubewell 2 3 42 0-100 Mins No Yes Salty water, iron 2 3 Yes Filter No Yes No No No No 3 4 1 1 Drum Multiple Months No No No No No No No No Yes Yes No No The water is salty Water needs to be collected from remote area None Of The Above No No No No No No No No Yes No Yes I Own The Businesses Borehole/Tubewell Quality Because the water is salty, the shop has to collect water from far away 1 / pure water is not easy to drink because water is salty. 2 / there is a problem with cooking for water 3 / water is not easily available for business. Somewhat Important Not Important At All Free Somewhat Dissatisfied Safe Water Services Very Good 3 2 2 Fairly Often Fairly Often Fairly Often Fairly Often
Host Whykong 8 Kharangkhali Na NA 21.04305 92.23357 Female My Spouse Married 30 My Spouse Shared Responsibility 3 2 0 Na No NA Na NA Secondary School Wood/Canvas/Plastic 2 No Owned Yes No No Gas Bottles 2 3 No No No Yes No No Yes Covid-19 False True True True True 33 Rarely Often Announced/Scheduled Often Often Sometimes Often Often Sometimes Often Sometimes Sometimes Often Often Stand Pipe Stand Pipe Surface Water Pond River Lake 120 2 1680 600+ Mins No Yes The taste becomes salty, the color becomes red. 90 3 No Other No No No Yes No No 4 4 6 Drum-01 Multiple Months No No Yes Yes No No No No No No No No The water level goes down. I will make arrangements to collect water. None Of The Above No No No No No No No No Yes No No I Work For Somebody Else Other Both Quantity & Quality In the dry season the water decreases and the water becomes salty and red. Need to collect water from others villages Very Important Very Important Fixed: Per Month Somewhat Satisfied Safe Water Services Fair 4 5 5 Fairly Often Fairly Often Fairly Often Fairly Often
Host Whykong 6 Jimonkhali Na NA 21.05977 92.22733 Female My Spouse Married 50 My Spouse Shared Responsibility 1 3 2 2017 No NA Na NA Primary School Wood/Canvas/Plastic 2 No Allocated/Given By Authorities No No No Gas Bottles 4 8 No No No Yes Yes No Yes I have to borrow True True True True True 8 Rarely Rarely Announced/Scheduled Never Rarely Sometimes Never Never Rarely Rarely Never Never Never Rarely Borehole/Tubewell Borehole/Tubewell Borehole/Tubewell 3 5 105 101-200 Mins No Yes Water is a little bit solty 3 7 Yes Filter No Yes No No No No 2 5 2 1 Drum Multiple Months No No No No No No Yes Yes No No No No Water dries up due to overcrowding and overuse. It is difficult to collect water from a distance. None Of The Above No No No No No No No No Yes No Yes I Own The Businesses Borehole/Tubewell Quality It is very difficult to drink pure water because the water is salty. Water has to be collected and brought from far away. Not Important At All Not Important At All Free Somewhat Satisfied Better Health Services Very Good 3 2 2 Fairly Often Almost Never Almost Never Fairly Often
Host Whykong 8 Purbo Moheskhalia Para Na NA 21.04336 92.23361 Female Myself Widowed 57 Myself Shared Responsibility 5 6 1 Na No NA Na NA No Formal Education Concrete/Brick 4 Yes Owned Yes Yes No Wood 4 4 No Yes No Yes No No Yes Covid 19 False True True True True 27 Often Often Announced/Scheduled Sometimes Sometimes Sometimes Sometimes Sometimes Sometimes Sometimes Rarely Rarely Sometimes Often Stand Pipe Stand Pipe Stand Pipe 40 2 560 501-600 Mins No Yes The water tastes salty and the water turns red. 40 2 No Other No No No Yes No No 3 5 4 2 Dram, 2 Water Pot Multiple Months No No Yes Yes No No No No No No No No The water level goes down. Arranges water supply. None Of The Above No No No No No No No No Yes No No I Work For Somebody Else Stand Pipe Both Quantity & Quality The taste of water decreases and the color turns red. I have collected water from the house next door. Very Important Very Important Variable: Set Factors Somewhat Satisfied Safe Water Services Good 3 4 6 Fairly Often Fairly Often Fairly Often Very Often
Host Whykong 8 Purbo Moheskhalia Para Na NA 21.04334 92.23378 Female Myself Married 35 Myself Shared Responsibility 4 3 2 Na No NA Na NA No Formal Education Wood/Canvas/Plastic 3 Yes Owned Yes Yes No Wood 3 3 No No No Yes No No No NA False True True True True 26 Sometimes Sometimes Announced/Scheduled Sometimes Sometimes Sometimes Sometimes Sometimes Rarely Sometimes Rarely Sometimes Often Often Stand Pipe Other Surface Water Pond River Lake 30 2 420 401-500 Mins No Yes The taste becomes salty and the color turns red. 120 4 No Other No No No Yes No No 10 10 0 One Dram, 15 Small Bottol Multiple Months No No Yes Yes No No No No No No No No Going down the water level I arrange water supply. None Of The Above No No No No No No No No Yes No No I Work For Somebody Else Stand Pipe Both Quantity & Quality The amount decreases and the water turns red. I supply water from the next village. Very Important Very Important Fixed: Per Month Somewhat Satisfied Safe Water Services Fair 4 5 5 Fairly Often Fairly Often Fairly Often Fairly Often
Host Whykong 6 Jimonkhali Na NA 21.05931 92.22797 Male Myself Married 20 Myself Shared Responsibility 0 2 0 1991 No NA Na NA No Formal Education Wood/Canvas/Plastic 1 No Allocated/Given By Authorities No No No Gas Bottles 5 3 No No No No Yes No Yes I worked in agriculture and had accumulated rice and ate them. And some have had to borrow. True True True True True 18 Often Often Unexpected Sometimes Sometimes Rarely Rarely Rarely Rarely Sometimes Rarely Rarely Never Never Borehole/Tubewell Borehole/Tubewell Borehole/Tubewell 7 11 539 501-600 Mins No No NA 7 8 No Other No No No Yes No No 4 4 4 4 Water Jar Multiple Months No No No No No No Yes Yes Yes No No No Water naturally becomes salt. It is difficult to collect water from far away. Multiple Activities Yes No No No Yes No No No No Yes Yes I Own The Businesses Borehole/Tubewell Both Quantity & Quality Due to the low water content and the addition of water and salt, pure water and cooking are very difficult. We were bound to use this salty water. Very Important Very Important Free Somewhat Dissatisfied Better Health Services Very Good 6 5 7 Fairly Often Fairly Often Very Often Fairly Often

Mapping

Interactive mapping of geospatial data is achieved using the leaflet library (Cheng, Karambelkar and Xie, 2022). Layer controls allow the user to select from three different base maps. Polygon layers can be selected to visualise the boundaries of refugee camps (ISCG, 2022) and administrative regions (OCHA, 2020). Additional data is imported to enable the user to visualise the locations of existing sanitation facilities (OCHA, 2022).

Mapped data points can be colour-coded according to any variable in the dataset, and a matching legend is generated automatically.

Selecting any point displays a pop-up containing more detailed information.
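
Because the colouring variable may be numeric or categorical, the palette has to match its type. The following is a minimal sketch of one way this could be handled, not the application's actual code: `make_palette`, the toy data frame and the colour values are hypothetical, while `colorNumeric`, `colorFactor` and `addLegend` are standard leaflet functions.

```r
library(leaflet)

# Hypothetical helper: choose a palette function based on the variable's type.
make_palette <- function(values, colours = c("#cc0000", "#ffd700", "#003366")) {
  if (is.numeric(values)) {
    # continuous scale for numeric variables such as the HWISE score
    colorNumeric(colours, domain = values, na.color = "#808080")
  } else {
    # one colour per category for text variables
    colorFactor(colours, domain = unique(values), na.color = "#808080")
  }
}

# Toy data standing in for the survey responses (values are illustrative only)
toy <- data.frame(
  lng = c(92.22, 92.23, 92.24),
  lat = c(21.05, 21.04, 21.06),
  hw = c(3, 18, 33),
  location = c("Host", "Camp", "Host")
)

colour_var <- "location"  # in the app this would come from a user input
pal <- make_palette(toy[[colour_var]])

m <- leaflet(toy) %>%
  addTiles() %>%
  addCircleMarkers(lng = ~lng, lat = ~lat, radius = 4,
                   stroke = FALSE, fillOpacity = 1,
                   color = pal(toy[[colour_var]])) %>%
  addLegend("topright", pal = pal, values = toy[[colour_var]],
            title = colour_var)
```

Setting `colour_var <- "hw"` would take the numeric branch and produce a continuous legend instead of a categorical one.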

# Read shape files
camps <- st_read(dsn = "../data/shapefiles/camp", layer = "T220130_RRC_Outline_Camp_AL1")
camps <- subset(camps, camps$Upazila=="Teknaf")
  
file<-tempdir()
unzip("../data/shapefiles/union/bgd_admbnda_adm4_bbs_20201113.zip", exdir=file)
unions <- st_read(dsn = file, layer = "bgd_admbnda_adm4_bbs_20201113")
unions <- subset(unions, unions$ADM4_EN %in% c("Baharchhara","Nhilla","Sabrang","Teknaf","Teknaf Paurashava","Whykong"))

file1<-tempdir()
unzip("../data/shapefiles/infra/WASH_Infras_LT_Bath_TW_May_31_2022A.zip", exdir=file1)
unzip("../data/shapefiles/infra/WASH_Infras_LT_Bath_TW_May_31_2022B.zip", exdir=file1)
infra <- st_read(dsn = file1, layer = "WASH_Infras_LT_Bath_TW_May_31_2022")
# keep Teknaf facilities only, then the first 33 columns
infra <- subset(infra, infra$Upazila=="Teknaf")[, 1:33]

# replacements for consistency
infra$Type_Faci <- sapply(infra$Type_Faci, function(x) {
  x = replace(x, which(x=="latrine"),"Latrine")
  x = replace(x, which(x=="Tubewell-Handpump"), "Handpump Tubewell")
  x = replace(x, which(x=="Both (Latrine & Bathing)"), "Latrine & Bathing")
})

# create base map
base_map <- leaflet(data = text_data) %>% 
  setView(lng = text_data[1, "gps-longitude"],
          lat = text_data[1, "gps-latitude"],
          zoom=12) %>%
  addTiles(group = "Default Map") %>%
  addProviderTiles("CartoDB.Positron", group = "Minimal Map") %>%
  addProviderTiles("Esri.WorldImagery", group = "Satellite Map") %>%
  addScaleBar(position = "bottomleft",
              options = scaleBarOptions(maxWidth=400)) %>%
  addMiniMap(position = "bottomright", toggleDisplay = TRUE, width = 200) %>%

  # add infrastructure markers
  addCircleMarkers(data = infra[infra$Type_Faci=="Bathing Cubicle" | infra$Type_Faci=="Latrine & Bathing", ],
                       ~Long,
                       ~Lat,
                       radius = 3,
                       stroke = TRUE,
                       weight = 1, opacity = 1,
                       fillOpacity = 0,
                       group = "Bathing Cubicle",
                       popup=~paste(
                         "<b>", "Agency: ",Agency,"</b><br/>",
                         "Facility: ", Type_Faci, "<br/>",
                         "Sub Type: ",Sub_Type_F, "<br/>",
                         "Bathing Total: ", Bathing, "<br/>",
                         "Female: ", Bathing_F, "<br/>",
                         "Male: ", Bathing_M, "<br/>",
                         "Universal: ", Bath_gen_u, "<br/>"))%>%
  addCircleMarkers(data = infra[infra$Type_Faci=="Latrine" | infra$Type_Faci=="Latrine & Bathing", ],
                       ~Long,
                       ~Lat,
                       radius = 3,
                       stroke = TRUE,
                       weight = 1, opacity = 1,
                       fillOpacity = 0,
                       color="green",
                       group = "Latrine",
                       popup=~paste(
                         "<b>", "Agency: ",Agency,"</b><br/>",
                         "Facility: ", Type_Faci, "<br/>",
                         "Sub Type: ",Sub_Type_F, "<br/>",
                         "Latrines: ", LT, "<br/>",
                         "Female: ", LT_F, "<br/>",
                         "Male: ", LT_M, "<br/>",
                         "Universal: ", LT_Gen_uns, "<br/>",
                         "Slabs: ", Slabs, "<br/>",
                         "Rings: ", Rings, "<br/>",
                         "Structure: ", struc_wall, ", ", struc_pill, "<br/>",
                         "Volume (m3): ", Volume_M3, "<br/>"))%>%
  addCircleMarkers(data = infra[infra$Type_Faci=="Handpump Tubewell", ],
                       ~Long,
                       ~Lat,
                       radius = 3,
                       stroke = TRUE,
                       weight = 1, opacity = 1,
                       fillOpacity = 0,
                       color="red",
                       group = "Handpump Tubewell",
                       popup=~paste(
                         "<b>", "Agency: ",Agency,"</b><br/>",
                         "Facility: ", Type_Faci, "<br/>",
                         "Sub Type: ",Sub_Type_F, "<br/>",
                         "TW_Depth: ", Depth_TW_F, "<br/>"))%>%
  # add polygons
  addPolygons(data = camps,
                  color = "#800080", weight = 2, smoothFactor = 0.5, opacity = 1, 
                  fill= FALSE, group = "Camp Boundaries") %>%
  addPolygons(data = unions,
                  color = "#cc0000", weight = 2, smoothFactor = 0.5, opacity = 1, 
                  fill= FALSE, group = "Administrative Boundaries") %>%

  # add UI toggles
  addLayersControl(baseGroups = c("Default Map","Satellite Map", "Minimal Map"),
                       overlayGroups = c("Camp Boundaries", "Administrative Boundaries", "Bathing Cubicle", "Latrine", "Handpump Tubewell"),
                       options = layersControlOptions(collapsed = FALSE)) %>%
  hideGroup(c("Bathing Cubicle", "Latrine", "Handpump Tubewell"))

pal1 <- colorRampPalette(c("#cc0000","#ff1a1a","#ffa500","#ffd700","#00ff00","#00ced1","#003366"))
# colorData <- filtered_data()[[input$circle_color]]
color <- colorNumeric(rev(pal1(6)), domain = num_data[["hw"]])
legend_vals = text_data[["hw"]]
      
m1 <- base_map %>%
      addCircleMarkers(data = text_data,
                       text_data[, "gps-longitude"],
                       text_data[, "gps-latitude"],
                       radius = 4,
                       stroke = FALSE,
                       group = "points",
                       fillOpacity=1,
                       popup = ~paste(
                         "<b>", "HWISE Score: ","</b>",hw,"<br/>",
                         "<b>","Primary source (drinking): ","</b>",wat1, "<br/>",
                         "<b>","Primary source (non-drinking): ","</b>",wat3, "<br/>",
                         "<b>","Collection Time (drinking, weekly): ","</b>",wat5a, "<br/>",
                         "<b>","Treatment (drinking): ","</b>",wat12_, "<br/>",
                         "<b>","Quality problems: ","</b>",wat8, "<br/>",
                         "<b>","Supply interruptions: ","</b>",hw2, "<br/>",
                         "<b>","No useable/drinkable water: ","</b>",hw11, "<br/>",
                         "<b>","Borrow water: ","</b>",hw13, "<br/>"
                       ),
                       color = ~color(text_data[["hw"]])) %>%
      addLegend("topright",
                pal=color,
                values=legend_vals,
                opacity = 1,
                title=q_data[["hw"]],
                layerId="colorLegend")

Example Map (interactive)

An example map illustrating mapped points colour-coded according to overall water insecurity score.


Graphing

Plots are created using the ggplot2 library (Wickham, 2016). Three chart types, chosen to suit end users' statistical knowledge, can be generated: bar plots, box plots and scatter plots.

All displayed variables can be configured by the user, including the x and y variables (where relevant) and an optional third grouping variable.
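
When the x, y and grouping variables arrive as strings (for example from a drop-down input), they cannot be written directly inside `aes()`. The sketch below shows one common way to handle this using ggplot2's `.data` pronoun. It is an illustration under assumptions rather than the application's code: `plot_selected` and the toy data are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: build a box plot from column names supplied as strings.
plot_selected <- function(df, x_var, y_var, group_var = NULL) {
  p <- ggplot(df, aes(x = factor(.data[[x_var]]), y = .data[[y_var]]))
  if (!is.null(group_var)) {
    # the optional third variable is mapped to the fill colour
    p <- p + aes(fill = factor(.data[[group_var]]))
  }
  p + geom_boxplot() + labs(x = x_var, y = y_var, fill = group_var)
}

# Toy data standing in for the survey responses (values are illustrative only)
toy <- data.frame(
  hw1 = c(1, 1, 2, 2, 3, 3),
  hw = c(5, 9, 14, 20, 27, 31),
  location = c("Host", "Camp", "Host", "Camp", "Host", "Camp")
)

p <- plot_selected(toy, "hw1", "hw", group_var = "location")
```

The resulting object is an ordinary ggplot, so it can be passed to `plotly::ggplotly()` in the same way as the examples that follow.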

All charts are rendered as interactive objects through the use of the plotly library (Sievert, 2020). Interactive features include:

  • Tool-tips displayed on hover-over.
  • Click and drag to zoom.
  • Clickable legend to toggle chart elements.
  • Export plot to .png.

Example Plots (interactive)

# Set theme 
my_theme <- theme(plot.title = element_text(size=13, color="#464646",face="bold", hjust=0.5),
                  axis.title = element_text(size=13, color="#464646"),
                  axis.text = element_text(size=12, color="#464646"),
                  axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
                  legend.title = element_text(size=13, color="#464646",face="bold",hjust=0.5),
                  legend.text = element_text(size=12, color="#464646"),
                  legend.background = element_rect(fill = "#f2f2f2"),
                  panel.background = element_rect(fill = "#f2f2f2"),
                  plot.margin = margin(1,1,1,1, "cm"))

paletteFunc <- colorRampPalette(c("#18db50","#1d6cac","#00b4ff"))

Bar Plot

# categorical x, not grouped
p1 <- num_data %>% 
      ggplot +
      aes(factor(hw2)) +
      # a single bar layer; the extra unfilled geom_bar() drew every bar twice
      geom_bar(fill = paletteFunc(length(unique(num_data[["hw2"]])))) +
      scale_x_discrete(breaks = num_data[["hw2"]],
                       labels = text_data[["hw2"]]) +
      labs(title = paste0(q_data["hw2"]),
           x = NULL) +
      my_theme
      
ggplotly(p1, tooltip = "y")

Box Plot

p2 <- num_data %>% 
  ggplot +
  aes(x = factor(hw1), y = hw, fill = location)+
  geom_boxplot(varwidth = TRUE,
               alpha = 0.8,
               outlier.size = 2,
               outlier.shape = 8,
               outlier.colour = "black") +
  labs(title = paste0(q_data["hw1"]),
       x = NULL,
       y = paste0(q_data["hw"]),
       fill = "")+
  scale_fill_manual(values = paletteFunc(length(levels(factor(num_data[["location"]]))))) +
  scale_x_discrete(breaks = num_data[["hw1"]], 
                   labels = text_data[["hw1"]]) +
  #scale_y_continuous(breaks = seq(0,max(y,na.rm=T),ceiling(max(y,na.rm=T)/10))) +
  my_theme
  
ggplotly(p2) %>% layout(boxmode = "group")

Scatter Plot

p3 <- num_data %>%
  ggplot +
  aes(x = wat5, y = hw, color = camp) +
  geom_jitter(size = 1, width=0.1, height=0.1 )+
  geom_smooth(method = lm, color = "#ffa500") +
  labs(title = paste0(q_data[["wat5"]], " ~ ",q_data[["hw"]]),
       x = paste0(q_data[["wat5"]]),
       y = paste0(q_data[["hw"]]),
       color = paste0(q_data[["camp"]]))  + 
  scale_color_manual(values = paletteFunc(length(unique(num_data[["camp"]])))) + 
  my_theme
  
ggplotly(p3) %>% layout(legend = list(font = list(size = 12)))

Summary Statistics

Summary statistics are generated using the skimr library (Waring et al., 2022). Given the end users' limited statistical knowledge, extensive descriptive statistics are not required. Range, mean, median and standard deviation summarise the central tendency and variation of the data at a level that is in line with users' existing knowledge.

Statistics for numeric and categorical data are displayed separately, giving the user greater control over which data are shown. An optional grouping variable can be selected to display summary statistics for specified subgroups.
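
To make the effect of the grouping variable concrete, the sketch below computes the same handful of statistics per subgroup with plain dplyr. It is a simplified stand-in for the skimr pipeline used in the application, and the toy data frame is hypothetical.

```r
library(dplyr)

# Toy data standing in for the survey responses (values are illustrative only)
toy <- data.frame(
  location = c("Host", "Host", "Camp", "Camp", "Camp"),
  hw = c(3, 33, 8, 27, NA)
)

# One row of summary statistics per level of the grouping variable
grouped_stats <- toy %>%
  group_by(location) %>%
  summarise(
    Missing = sum(is.na(hw)),
    Mean = round(mean(hw, na.rm = TRUE), 1),
    Min = min(hw, na.rm = TRUE),
    Median = median(hw, na.rm = TRUE),
    Max = max(hw, na.rm = TRUE),
    `Standard Deviation` = round(sd(hw, na.rm = TRUE), 1)
  )
```

Each subgroup receives its own row, so the Host and Camp figures can be compared side by side.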

# Define custom skim (display full names for factor levels in cat data)
my_skim <- skim_with(factor = sfl(top_counts = ~top_counts(., max_char = 25, max_levels = 50)),numeric = list(hist = NULL))

# create all summary stats
sum_df <- text_data %>%
      mutate(across(which(sapply(.,class)!="numeric"),factor)) %>%
      select(-c("gps-longitude","gps-latitude",sd11_3,w8,wat8,wat16,wat17,liv5,liv6)) %>%
      #group_by_at(group_val) %>%
      my_skim %>%
      focus(c(
        skim_variable,
        #all_of(group_val),
        n_missing,
        numeric.mean,
        numeric.p0,
        numeric.p50,
        numeric.p100,
        numeric.sd,
        factor.n_unique,
        factor.top_counts))

# rename columns
sum_df <- data.frame(sum_df) %>%
  rename("Type" = "skim_type",
         "Variable" = "skim_variable",
         "Missing" = "n_missing",
         "Mean" = "numeric.mean",
         "Min" = "numeric.p0",
         "Median" = "numeric.p50",
         "Max" = "numeric.p100",
         "Standard Deviation" = "numeric.sd",
         "Category Levels" = "factor.n_unique",
         "Category Level Counts" = "factor.top_counts")

# round numeric data
sum_df <- sum_df %>% mutate(across(where(is.numeric), ~ round(.x, 1)))

Numeric Data

# numeric data
sum_num <- sum_df%>%
  filter(Type == "numeric") %>%
  select(-c(`Category Levels`,`Category Level Counts`, `Type`))%>%
  mutate(Question = sapply(Variable, function(x) q_data[[x]])) %>% # add question text column
  relocate(Question, .after = Variable)

kable(sum_num) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")

| Variable | Question | Missing | Mean | Min | Median | Max | Standard Deviation |
|---|---|---|---|---|---|---|---|
| sd3 | Age | 0 | 38.4 | 17 | 36 | 80 | 11.7 |
| sd6_ | Household: Children Under 18 | 0 | 3.2 | 0 | 3 | 11 | 1.8 |
| sd7 | Household: Adult Members | 0 | 2.9 | 0 | 2 | 14 | 1.6 |
| sd8 | Household: Elderly Members | 0 | 1.1 | 0 | 1 | 10 | 1.1 |
| sd14 | House: No. of Rooms | 0 | 2.5 | 1 | 2 | 12 | 1.0 |
| sd21 | Rate Community: Socioeconomic Standing | 0 | 2.7 | 0 | 2 | 8 | 1.5 |
| sd22 | Rate Community: Water Situation | 0 | 3.0 | 0 | 3 | 9 | 1.8 |
| hw | Calculated HWISE Score | 0 | 19.0 | 0 | 19 | 39 | 9.7 |
| wat4 | Drinking Water: Time to Source (mins) | 0 | 15.3 | 0 | 10 | 120 | 17.2 |
| wat5 | Drinking Water: No.of Trips | 0 | 2.6 | 0 | 2 | 23 | 2.7 |
| wat5a | Drinking Water: Collection Time Per Week (mins) | 0 | 295.2 | 0 | 140 | 8400 | 623.2 |
| wat9 | Non-drinking Water: Time to Source (mins) | 0 | 15.6 | 0 | 10 | 135 | 17.7 |
| wat10 | Non-drinking Water: No.of Trips | 0 | 3.0 | 0 | 2 | 20 | 2.6 |
| wat14 | Kolosh | 0 | 2.7 | 0 | 2 | 10 | 1.6 |
| wat14_1 | Buckets | 0 | 3.0 | 0 | 3 | 11 | 1.7 |
| wat14_2 | Jerry Cans | 0 | 1.3 | 0 | 0 | 15 | 1.9 |
| wb4 | No. Days Poor Physical Health | 0 | 4.6 | 0 | 4 | 60 | 4.8 |
| wb5 | No. Days Poor Mental Health | 0 | 4.1 | 0 | 2 | 30 | 4.8 |
| wb6 | No. Days Health Prevented Normal Activities | 0 | 3.1 | 0 | 2 | 30 | 3.8 |

Categorical Data

# categorical data 
sum_cat <- sum_df%>%
  filter(Type == "factor") %>%
  select(-c(`Type`,`Mean`,`Min`,`Median`, `Max`, `Standard Deviation`))%>%
  mutate(Question = sapply(Variable, function(x) q_data[[x]])) %>% # add question text column
  relocate(Question, .after = Variable)

kable(sum_cat) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
Variable Question Missing Category Levels Category Level Counts
location Refugee Camp or Host Community 0 2 Host: 235, Camp: 225
union Union 0 7 NA: 225, Sabrang: 60, Whykong: 59, Nhila: 56, Teknaf: 25, Teknaf Municipality: 25, Baharchhara: 10
ward Ward 0 10 NA: 225, 5: 56, 3: 44, 1: 41, 6: 41, 2: 28, 4: 12, 8: 6, 7: 5, 9: 2
village Village 0 53 NA: 225, Puran Para: 23, West Sikdar Para: 21, Natmura Para: 13, Pahartoli: 9, Jimonkhali: 8, Kataboniya Para: 8, College Para: 7, Digliya Para: 7, Ghula Para: 7, Naytong Para: 7, Purbo Moheskhalia Para: 7, East Pahadtoli: 6, Hariya Khali: 6, Fuler Deil: 5, Jahaliya Para: 5, Solish Para: 5, Balukhali: 4, Bazar Para: 4, Horaci Para: 4, Katakhali: 4, Lombabil: 4, Moheshkalia Para: 4, Monir Ghuna: 4, Poran Pollon Para: 4, Uncchiprang: 4, Adorsho Gram: 3, Chandoli Para: 3, Chowdhury Para: 3, East Natmora Para: 3, Habir Chora: 3, Keti Bil: 3, Panchori Para: 3, Uttar Noya Para: 3, Uttor Para: 3, Ajiya Khali: 2, Dail Para: 2, Diallarbil: 2, Ghunar Para: 2, Islamabad: 2, Kanjor Para: 2, Kosobaniya Para: 2, Major Para: 2, Puran Bazar: 2, Puraton Pollan Para: 2, Choto Habi Para: 1, Jalia Para: 1, Kharangkhali: 1, Lafarghona: 1, Lengur Bill: 1
camp Camp 0 6 Na: 235, Camp 26: 71, Camp 24: 58, Nayapara Refugee Camp: 49, Camp 27: 24, Camp 25: 23
block Block 0 8 NA: 235, C: 67, D: 64, B: 36, A: 27, G: 16, E: 13, F: 2
sex Sex 0 2 Female: 284, Male: 176
sd1 Head Of Hh 0 5 Myself: 266, My Spouse: 182, Other: 6, A Grandparent: 4, An Adult Child: 2
sd2 Relationship Status 0 4 Married: 417, Widowed: 29, Divorced: 8, Single: 6
sd4 Household: Responsible To Get Water 0 4 Myself: 238, My Spouse: 195, Other: 16, Adult Child: 11
sd5_ Household: Ensures Sufficient Water 0 6 Shared Responsibility: 267, Self: 110, Spouse: 64, Children: Girl: 10, Other: 6, Children: Boy: 3
sd9 Left Myanmar 0 33 Na: 235, 2017: 115, 1992: 20, 1991: 10, 2018: 7, 2007: 6, 1990: 5, 2008: 5, 1987: 4, 1989: 4, 2003: 4, 2006: 4, 2014: 4, 1994: 3, 1995: 3, 2001: 3, 2009: 3, 1988: 2, 1993: 2, 1996: 2, 1997: 2, 1998: 2, 2002: 2, 2005: 2, 2012: 2, 2016: 2, 1980: 1, 1981: 1, 2000: 1, 2004: 1, 2010: 1, 2013: 1, 2015: 1
sd11 Financial Assistance 0 2 No: 366, Yes: 94
sd11_1 Financial Assistance: Amount 0 25 NA: 366, 0: 46, 1050: 16, 500: 5, 700: 3, 750: 3, 4000: 2, 6000: 2, 1000: 1, 10000: 1, 1015: 1, 1030: 1, 1500: 1, 2000: 1, 20000: 1, 2400: 1, 28000: 1, 3000: 1, 30000: 1, 4500: 1, 5000: 1, 5200: 1, 5600: 1, 8000: 1, 9000: 1
sd11_2 Financial Assistance: Source 0 4 Na: 367, Ngo: 75, Government: 14, Old Age Allowance: 4
sd12 Highest Level Of Education 0 4 No Formal Education: 265, Primary School: 141, Secondary School: 48, University College: 6
sd13 House: Type 0 4 Wood/Canvas/Plastic: 279, Other: 82, Concrete/Brick: 79, Processed Wood: 20
sd15 House: Has a Garden 0 2 No: 371, Yes: 89
sd16 House: Ownership 0 4 Owned: 226, Allocated/Given By Author: 176, Rented: 55, Other: 3
sd17 House: Electricity Supply 0 2 Yes: 242, No: 218
sd18 House: Piped Water Supply 0 2 No: 335, Yes: 125
sd19 House: Sewerage Connection 0 2 No: 352, Yes: 108
sd20 Primary Fuel Source 0 2 Gas Bottles: 270, Wood: 190
w1_1 Income Source: Employment In Government 0 2 No: 459, Yes: 1
w2 Income Source: Employment In Private Sector 0 2 No: 413, Yes: 47
w3 Income Source: Casual Labour 0 2 Yes: 328, No: 132
w4 Income Source: Own Business 0 2 No: 378, Yes: 82
w5 Income Source: Farming 0 2 No: 382, Yes: 78
w6 Income Source: From Family Relative 0 2 No: 445, Yes: 15
w7 Difficult To Obtain Sufficient Income 0 2 Yes: 317, No: 143
p1 Water&Sanitation Needs Being Met 0 2 False: 260, True: 200
p2 Women & Men: Equal Responsibility for Sanitation 0 2 True: 332, False: 128
p3 Women & Men: Equal Awareness of Feedback Processes 0 2 True: 316, False: 144
p4 Women & Men: Feedback Equally Valued 0 2 True: 350, False: 110
p5 Women & Men: Equal Awareness of Sanitation Rights 0 2 True: 352, False: 108
hw1 Worry About Water Supply 0 5 Sometimes: 154, Often: 123, Rarely: 93, Never: 56, Always: 34
hw2 Supply Interruptions 0 5 Sometimes: 171, Rarely: 113, Often: 108, Never: 53, Always: 15
hw2a Supply Interruptions: Expected or Unexpected 0 3 Unexpected: 250, Announced/Scheduled: 157, NA: 53
hw3 Unable to do Laundry Due to Water Situation 0 5 Sometimes: 169, Often: 117, Rarely: 101, Never: 59, Always: 14
hw4 Schedule Change Due to Water Situation 0 5 Sometimes: 183, Rarely: 140, Never: 69, Often: 63, Always: 5
hw5 Change What Was Eaten Due to Water Situation 0 5 Sometimes: 165, Rarely: 146, Never: 81, Often: 64, Always: 4
hw6 Unable to Wash Hands After Dirty Activity 0 4 Rarely: 183, Never: 121, Sometimes: 98, Often: 58
hw7 Unable to Wash Body Due to Water Situation 0 5 Sometimes: 171, Rarely: 136, Never: 84, Often: 65, Always: 4
hw8 Not Enough Water to Drink 0 5 Rarely: 146, Sometimes: 146, Never: 101, Often: 62, Always: 5
hw9 Felt Anger About Water Situation 0 5 Rarely: 154, Sometimes: 134, Never: 125, Often: 46, Always: 1
hw10 Gone to Sleep Thirsty 0 5 Rarely: 177, Never: 165, Sometimes: 83, Often: 33, Always: 2
hw11 No Useable or Drinkable Water 0 5 Sometimes: 157, Rarely: 153, Never: 87, Often: 58, Always: 5
hw12 Felt Shame About Water Situation 0 5 Rarely: 169, Sometimes: 125, Never: 99, Often: 61, Always: 6
hw13 Asked to Borrow Water 0 5 Rarely: 162, Sometimes: 148, Never: 86, Often: 56, Always: 8
wat1 Drinking Water: Primary Source 0 9 Stand Pipe: 217, Borehole/Tubewell: 119, Piped Water To Dwelling: 93, Other Person: 15, Protected Dug Well: 7, Other: 4, Small Water Vendor: 2, Tanker Truck: 2, Surface Water Pond River : 1
wat2 Drinking Water: Secondary Source 0 11 Stand Pipe: 143, Borehole/Tubewell: 110, Piped Water To Dwelling: 88, Other Person: 63, Surface Water Pond River : 24, Protected Dug Well: 12, Other: 7, Unprotected Dug Well: 5, Bottled Water: 4, Rainwater Collection: 2, Small Water Vendor: 2
wat3 Non-drinking Water: Primary Source 0 10 Stand Pipe: 160, Borehole/Tubewell: 121, Piped Water To Dwelling: 97, Surface Water Pond River : 40, Protected Dug Well: 14, Other Person: 12, Other: 7, Unprotected Dug Well: 6, Rainwater Collection: 2, Tanker Truck: 1
wat5b Drinking Water: Collection Time Per Week (grouped) 0 7 0-100 Mins: 189, 201-300 Mins: 70, 101-200 Mins: 69, 600+ Mins: 59, 401-500 Mins: 33, 301-400 Mins: 23, 501-600 Mins: 17
wat6 Injured While Fetching Water 0 2 No: 318, Yes: 142
wat7 Problem With Water Quality 0 2 No: 301, Yes: 159
wat11 Treated Water To Make It Safer 0 2 No: 318, Yes: 142
wat12_ Drinking Water Treatment 0 6 None: 217, Add Chemicals/Chlorine: 116, Other: 47, Combined Methods: 46, Boil: 17, Filter: 17
wat12_1 Drinking Water Treatment: Boil 0 2 No: 407, Yes: 53
wat12_2 Drinking Water Treatment: Filter 0 2 No: 414, Yes: 46
wat12_3 Drinking Water Treatment: Add Chemicals Chlorine Tablets 0 2 No: 318, Yes: 142
wat12_4 Drinking Water Treatment: Other Specify 0 2 No: 411, Yes: 49
wat12_5 Drinking Water Treatment: Do Nothing 0 2 No: 243, Yes: 217
wat12_6_ Drinking Water Treatment: Multiple Methods 0 2 No: 414, Yes: 46
wat14_3 Other 0 66 1 Drum: 175, 0: 84, 2 Drums: 58, No: 19, 3 Drums: 17, Drum: 7, Drum-2: 7, Drum 1: 7, 4 Drums: 5, Drum-1: 5, N/A: 5, Drum 2: 4, 2 Drum: 3, 5 Drums: 3, Na: 3, 1 Jug: 2, 2 Drums And 2 Jars: 2, 2 Jugs: 2, 5 Bottles: 2, One Dram: 2, Two Drum: 2, Two Drums: 2, 1 Bowl: 1, 1 Bowl, 1 Drum: 1, 1 Container And 1 Drum: 1, 1 Drams: 1, 1 Drum And 1 Jar: 1, 1 Drum And 2 Jar: 1, 1 Drum And 2 Tanks: 1, 1 Drum And 4 Tanks: 1, 1 Drum And Several Bottle: 1, 1 Drum Set: 1, 1 Drum, 2 Jugs: 1, 1 Jug And 4 Bottles: 1, 1 Little Tank: 1, 1 Water Jar: 1, 1 Water Tank: 1, 2 Dram, 2 Water Pot: 1, 2 Drams: 1, 2 Druns: 1, 2 Jar And 1 Drum: 1, 3 Bottles And 1 Container: 1, 3 Drams: 1, 3 Jugs: 1, 4 Bottles: 1, 4 Jars: 1, 4 Tank: 1, 4 Water Jar: 1, Berl 2: 1, Big Jars: 1
wat15_ Fewest Water Problems 0 5 Multiple Months: 449, Jul: 5, Apr: 4, Mar: 1, May: 1
wat15_1 Fewest Water Problems: January 0 2 No: 417, Yes: 43
wat15_2 Fewest Water Problems: February 0 2 No: 412, Yes: 48
wat15_3 Fewest Water Problems: March 0 2 No: 330, Yes: 130
wat15_4 Fewest Water Problems: April 0 2 No: 317, Yes: 143
wat15_5 Fewest Water Problems: May 0 2 No: 355, Yes: 105
wat15_6 Fewest Water Problems: June 0 2 Yes: 234, No: 226
wat15_7 Fewest Water Problems: July 0 2 Yes: 314, No: 146
wat15_8 Fewest Water Problems: August 0 2 Yes: 248, No: 212
wat15_9 Fewest Water Problems: September 0 2 No: 303, Yes: 157
wat15_10 Fewest Water Problems: October 0 2 No: 391, Yes: 69
wat15_11 Fewest Water Problems: November 0 2 No: 407, Yes: 53
wat15_12 Fewest Water Problems: December 0 2 No: 410, Yes: 50
liv1_ Income Generation 0 8 None Of The Above: 281, Other: 69, Multiple Activities: 46, Crop Production: 22, Livestock Production: 19, Betel Nut/Leaf Production: 14, Sea Fishing: 7, Salt Production: 2
liv1_1 Income Generation: Crop Production 0 2 No: 415, Yes: 45
liv1_2 Income Generation: Betel Nut Leaf Production 0 2 No: 426, Yes: 34
liv1_3 Income Generation: Livestock Production 0 2 No: 419, Yes: 41
liv1_4 Income Generation: Salt Production 0 2 No: 456, Yes: 4
liv1_5 Income Generation: Sea Fishing 0 2 No: 450, Yes: 10
liv1_6 Income Generation: Aquaculture Shrimp Farming 0 1 No: 460
liv1_7 Income Generation: Aquaculture Fish Farming 0 1 No: 460
liv1_8 Income Generation: Other 0 2 No: 371, Yes: 89
liv1_9 Income Generation: None Of Above 0 2 Yes: 293, No: 167
liv1_10_ Income Generation: Multiple Activities 0 2 No: 414, Yes: 46
liv2 Do You Own Businesses 0 3 NA: 246, Yes I Own The Businesses: 112, No I Work For Somebody El: 102
liv3 Income Generation: Primary Water Source 0 11 Na: 128, Other: 125, Stand Pipe: 75, Piped Water To Dwelling: 50, Borehole/Tubewell: 42, Surface Water Pond River : 17, Other Person: 9, Protected Dug Well: 7, Rainwater Collection: 5, Small Water Vendor: 1, Unprotected Dug Well: 1
liv4_ Income Generation: Problematic Water Quantity or Quality 0 4 No: 251, Quantity: 111, Both Quantity & Quality: 55, Quality: 43
pay1 Important To Pay To Install Your Own Water System 0 4 Very Important: 278, Not Important At All: 108, Somewhat Important: 65, Neither Imp Nor Unimporta: 9
pay2 Important To Pay For Regular Costs 0 4 Very Important: 172, Not Important At All: 155, Somewhat Important: 126, Neither Imp Nor Unimporta: 7
pay3 Fee Structure For Use Of Main System 0 7 Free: 344, Fixed: Per Month: 48, Variable: Set Factors: 27, No Fixed Payment: 22, Variable: Per Amount: 11, Fixed Cost Per Amount : 6, Dont Know: 2
wb1 How Do You Feel About Your Life 0 4 Somewhat Satisfied: 247, Somewhat Dissatisfied: 138, Very Dissatisfied: 47, Very Satisfied: 28
wb2 Change Would Make Biggest Positive Difference 0 5 Safe Water Services: 353, Better Health Services: 49, Safe Return To Myanmar: 21, Access To Education: 20, Other: 17
wb3 General Health 0 5 Fair: 199, Good: 124, Very Good: 62, Excellent: 59, Poor: 16
wb7 Feel Unable to Control Important Things 0 4 Fairly Often: 288, Almost Never: 89, Never: 47, Very Often: 36
wb8 Feel Confident in Ability to Control Problems 0 4 Fairly Often: 294, Almost Never: 88, Very Often: 53, Never: 25
wb9 Feel Things Going Your Way 0 4 Fairly Often: 291, Almost Never: 84, Never: 50, Very Often: 35
wb10 Feel Difficulties Could Not Be Overcome 0 4 Fairly Often: 275, Almost Never: 108, Very Often: 45, Never: 32
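
Each entry in the final column of the table above lists the levels of a categorical variable with their frequencies, most frequent first. The block below is a minimal base R sketch of one way such a string could be built. It is not the application's own method, which relies on a customised skimr summary; `level_counts()` is a hypothetical helper and the toy responses are invented for illustration.

```r
# Hypothetical helper: collapse a categorical vector into "Level: n" pairs,
# ordered from the most to the least frequent level.
level_counts <- function(x) {
  counts <- sort(table(x, useNA = "no"), decreasing = TRUE)
  paste0(names(counts), ": ", as.integer(counts), collapse = ", ")
}

# Invented toy responses (not survey data)
answers <- c("No", "Yes", "No", "No", "Yes")
summary_string <- level_counts(answers)
print(summary_string)
```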

Grouped Statistics

The optional grouping variable can be applied to statistics of either variable class. The example below shows numeric statistics grouped by the ‘location’ variable:

# create all summary stats
group_df <- text_data %>%
      mutate(across(which(sapply(.,class)!="numeric"),factor)) %>%
      select(-c("gps-longitude","gps-latitude",sd11_3,w8,wat8,wat16,wat17,liv5,liv6)) %>%
      group_by_at("location") %>%
      my_skim %>%
      focus(c(
        skim_variable,
        all_of("location"),
        n_missing,
        numeric.mean,
        numeric.p0,
        numeric.p50,
        numeric.p100,
        numeric.sd,
        factor.n_unique,
        factor.top_counts))

# rename columns
group_df <- data.frame(group_df) %>%
  rename("Type" = "skim_type",
         "Variable" = "skim_variable",
         "Location" = "location",
         "Missing" = "n_missing",
         "Mean" = "numeric.mean",
         "Min" = "numeric.p0",
         "Median" = "numeric.p50",
         "Max" = "numeric.p100",
         "Standard Deviation" = "numeric.sd",
         "Category Levels" = "factor.n_unique",
         "Category Level Counts" = "factor.top_counts")

# round numeric data
group_df <- group_df %>% mutate_if(is.numeric, round, 1)

# keep numeric variables only and add the question text for each
sum_group <- group_df %>%
  filter(Type == "numeric") %>%
  select(-c(`Category Levels`, `Category Level Counts`, `Type`)) %>%
  mutate(Question = sapply(Variable, function(x) q_data[[x]])) %>% # add question text column
  relocate(Question, .after = Variable)

# render as a scrollable table; bootstrap_options takes both styles as one character vector
kable(sum_group) %>%
  kable_styling(bootstrap_options = c("striped", "condensed"), full_width = FALSE) %>%
  scroll_box(width = "100%", height = "400px")
| Variable | Question | Location | Missing | Mean | Min | Median | Max | Standard Deviation |
|---|---|---|---|---|---|---|---|---|
| sd3 | Age | Camp | 0 | 38.1 | 17 | 36 | 76 | 12.2 |
| sd3 | Age | Host | 0 | 38.7 | 18 | 37 | 80 | 11.1 |
| sd6_ | Household: Children Under 18 | Camp | 0 | 3.2 | 0 | 3 | 11 | 1.8 |
| sd6_ | Household: Children Under 18 | Host | 0 | 3.2 | 0 | 3 | 10 | 1.8 |
| sd7 | Household: Adult Members | Camp | 0 | 2.9 | 1 | 2 | 14 | 1.5 |
| sd7 | Household: Adult Members | Host | 0 | 3.0 | 0 | 2 | 12 | 1.7 |
| sd8 | Household: Elderly Members | Camp | 0 | 1.1 | 0 | 1 | 6 | 1.0 |
| sd8 | Household: Elderly Members | Host | 0 | 1.1 | 0 | 1 | 10 | 1.1 |
| sd14 | House: No. of Rooms | Camp | 0 | 2.4 | 1 | 2 | 8 | 0.8 |
| sd14 | House: No. of Rooms | Host | 0 | 2.6 | 1 | 2 | 12 | 1.1 |
| sd21 | Rate Community: Socioeconomic Standing | Camp | 0 | 2.4 | 1 | 2 | 8 | 1.2 |
| sd21 | Rate Community: Socioeconomic Standing | Host | 0 | 3.1 | 0 | 3 | 8 | 1.6 |
| sd22 | Rate Community: Water Situation | Camp | 0 | 2.5 | 1 | 2 | 7 | 1.2 |
| sd22 | Rate Community: Water Situation | Host | 0 | 3.5 | 0 | 3 | 9 | 2.0 |
| hw | Calculated HWISE Score | Camp | 0 | 22.2 | 0 | 22 | 39 | 7.9 |
| hw | Calculated HWISE Score | Host | 0 | 15.9 | 0 | 16 | 36 | 10.2 |
| wat4 | Drinking Water: Time to Source (mins) | Camp | 0 | 18.9 | 0 | 10 | 120 | 18.3 |
| wat4 | Drinking Water: Time to Source (mins) | Host | 0 | 11.8 | 0 | 5 | 120 | 15.4 |
| wat5 | Drinking Water: No.of Trips | Camp | 0 | 2.1 | 0 | 2 | 20 | 2.5 |
| wat5 | Drinking Water: No.of Trips | Host | 0 | 3.1 | 0 | 3 | 23 | 2.8 |
| wat5a | Drinking Water: Collection Time Per Week (mins) | Camp | 0 | 348.4 | 0 | 140 | 8400 | 833.7 |
| wat5a | Drinking Water: Collection Time Per Week (mins) | Host | 0 | 244.3 | 0 | 140 | 1680 | 302.0 |
| wat9 | Non-drinking Water: Time to Source (mins) | Camp | 0 | 18.4 | 0 | 10 | 90 | 16.8 |
| wat9 | Non-drinking Water: Time to Source (mins) | Host | 0 | 12.9 | 0 | 5 | 135 | 18.2 |
| wat10 | Non-drinking Water: No.of Trips | Camp | 0 | 2.5 | 0 | 2 | 20 | 2.5 |
| wat10 | Non-drinking Water: No.of Trips | Host | 0 | 3.5 | 0 | 3 | 20 | 2.6 |
| wat14 | Kolosh | Camp | 0 | 3.1 | 0 | 3 | 9 | 1.6 |
| wat14 | Kolosh | Host | 0 | 2.3 | 0 | 2 | 10 | 1.5 |
| wat14_1 | Buckets | Camp | 0 | 3.2 | 0 | 3 | 11 | 1.8 |
| wat14_1 | Buckets | Host | 0 | 2.7 | 0 | 2 | 10 | 1.6 |
| wat14_2 | Jerry Cans | Camp | 0 | 1.6 | 0 | 1 | 10 | 2.0 |
| wat14_2 | Jerry Cans | Host | 0 | 1.0 | 0 | 0 | 15 | 1.7 |
| wb4 | No. Days Poor Physical Health | Camp | 0 | 4.8 | 0 | 4 | 60 | 5.5 |
| wb4 | No. Days Poor Physical Health | Host | 0 | 4.5 | 0 | 4 | 30 | 4.0 |
| wb5 | No. Days Poor Mental Health | Camp | 0 | 4.1 | 0 | 2 | 25 | 5.0 |
| wb5 | No. Days Poor Mental Health | Host | 0 | 4.1 | 0 | 3 | 30 | 4.6 |
| wb6 | No. Days Health Prevented Normal Activities | Camp | 0 | 3.0 | 0 | 2 | 30 | 3.7 |
| wb6 | No. Days Health Prevented Normal Activities | Host | 0 | 3.2 | 0 | 2 | 30 | 4.0 |
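
The same kind of grouped numeric summary can be approximated without skimr. The block below is a minimal base R sketch, not the application's own code: `summarise_numeric()` is a hypothetical helper, the toy data frame is invented for illustration, and `group_var = NULL` is assumed to stand in for leaving the optional grouping variable unset.

```r
# Hypothetical helper: per-variable numeric statistics, optionally split by a
# grouping column. With group_var = NULL all rows form a single group, "All".
summarise_numeric <- function(df, group_var = NULL) {
  # Numeric columns only, excluding the grouping column itself
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  num_cols <- setdiff(num_cols, group_var)

  groups <- if (is.null(group_var)) {
    list(All = df)
  } else {
    split(df, df[[group_var]], drop = TRUE)
  }

  rows <- list()
  for (g in names(groups)) {
    for (v in num_cols) {
      x <- groups[[g]][[v]]
      rows[[length(rows) + 1]] <- data.frame(
        Variable = v,
        Group = g,
        Missing = sum(is.na(x)),
        Mean = round(mean(x, na.rm = TRUE), 1),
        Min = min(x, na.rm = TRUE),
        Median = round(median(x, na.rm = TRUE), 1),
        Max = max(x, na.rm = TRUE),
        SD = round(sd(x, na.rm = TRUE), 1),
        stringsAsFactors = FALSE
      )
    }
  }

  # One row per variable and group, ordered like the table above
  out <- do.call(rbind, rows)
  out <- out[order(out$Variable, out$Group), ]
  rownames(out) <- NULL
  out
}

# Invented toy data (not survey values)
toy <- data.frame(
  location = c("Camp", "Camp", "Camp", "Host", "Host", "Host"),
  age = c(20, 30, 40, 25, 35, 45),
  trips = c(1, 2, NA, 3, 4, 5)
)

grouped <- summarise_numeric(toy, group_var = "location")
overall <- summarise_numeric(toy)
print(grouped)
```

Calling `summarise_numeric(toy)` without a grouping variable returns one row per numeric variable, labelled "All", which mirrors the ungrouped case.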

References

Chang, W. et al. (2022) Shiny: Web application framework for R [online]. Available from: https://CRAN.R-project.org/package=shiny.
Cheng, J., Karambelkar, B. and Xie, Y. (2022) Leaflet: Create interactive web maps with the JavaScript ’leaflet’ library [online]. Available from: https://CRAN.R-project.org/package=leaflet.
Demin, G. (2021) Excel.link: Convenient Data Exchange with Microsoft Excel [online]. Available from: https://CRAN.R-project.org/package=excel.link.
ISCG (2022) Bangladesh - Outline of camps of Rohingya refugees in Cox’s Bazar - Humanitarian Data Exchange [online]. Available from: https://data.humdata.org/dataset/outline-of-camps-sites-of-rohingya-refugees-in-cox-s-bazar-bangladesh.
OCHA (2020) Bangladesh - Subnational Administrative Boundaries - Humanitarian Data Exchange [online]. Available from: https://data.humdata.org/dataset/cod-ab-bgd.
OCHA (2022) WASH Infrastructures GPS Dataset & Map (latrine, bathing, tube-well)_may_31_2022 [online]. Available from: https://www.humanitarianresponse.info/en/operations/bangladesh/document/wash-infrastructures-gps-dataset-ltbathingtwmarch312022.
Sievert, C. (2020) Interactive web-based data visualization with R, plotly, and shiny [online]. Chapman and Hall/CRC. Available from: https://plotly-r.com.
Waring, E., Quinn, M., McNamara, A., Arino de la Rubia, E., Zhu, H. and Ellis, S. (2022) Skimr: Compact and flexible summaries of data [online]. Available from: https://CRAN.R-project.org/package=skimr.
Wickham, H. (2016) ggplot2: Elegant graphics for data analysis [online]. Springer-Verlag New York. Available from: https://ggplot2.tidyverse.org.
Wickham, H. (2022) Stringr: Simple, consistent wrappers for common string operations [online]. Available from: https://CRAN.R-project.org/package=stringr.